Writing Parquets to Azure (Blob Storage / Data Lake Gen2 Storage)
Writing Parquet files to Azure with Java

In this guide I will show you how to write Parquet files from plain Java code. To do that, we will work with AvroParquetWriter<GenericRecord>, together with Path and Configuration from the Hadoop (HDFS) libraries.

Why Java?

Parquet files are very commonly written with Apache Spark, but in many data flow architectures we don’t want to use Spark for microservices, for example a service that parses small chunks of data and saves them to blob storage....