Talend Component Kit / TCOMP-1904

Delegate Avro record in AvroRecord seems to be invalid


      This is related to the MapLang processor development in Pipeline Designer.

      We have a use case where the MQL interpreter generates a record containing an array of arrays of records. The MQL interpreter uses the RecordBuilderFactory API to create the TCK schema and record.
      However, the generated TCK record seems invalid. It is an instance of AvroRecord, which encapsulates an instance of GenericData$Record. But this Avro record does not contain GenericData$Array instances, only plain Java collections, and these collections do not contain GenericData$Record instances but AvroRecord instances. So we get an Avro record that contains TCK records, which seems invalid.

      To reproduce this issue easily, here is a simple code example:

      		// get RecordBuilderFactory
      		AvroRecordBuilderFactoryProvider recordBuilderFactoryProvider = new AvroRecordBuilderFactoryProvider();
      		System.setProperty("talend.component.beam.record.factory.impl", "avro");
      		RecordBuilderFactory recordBuilderFactory = recordBuilderFactoryProvider.apply("test");
      		// customer record schema
      		org.talend.sdk.component.api.record.Schema.Builder schemaBuilder = recordBuilderFactory.newSchemaBuilder(Type.RECORD);
      		Entry nameEntry = recordBuilderFactory.newEntryBuilder().withName("name")
      				.withNullable(true).withType(Type.STRING).build();
      		Entry ageEntry = recordBuilderFactory.newEntryBuilder().withName("age")
      				.withNullable(true).withType(Type.INT).build();
      		Schema customerSchema = schemaBuilder.withEntry(nameEntry).withEntry(ageEntry).build();
      		// record 1
      		Builder recordBuilder = recordBuilderFactory.newRecordBuilder(customerSchema);
      		recordBuilder.withString("name", "Tom Cruise");
      		recordBuilder.withInt("age", 58);
      		Record record1 = recordBuilder.build();
      		// record 2
      		recordBuilder = recordBuilderFactory.newRecordBuilder(customerSchema);
      		recordBuilder.withString("name", "Meryl Streep");
      		recordBuilder.withInt("age", 63);
      		Record record2 = recordBuilder.build();
      		// list 1
      		Collection<Record> list1 = new ArrayList<Record>();
      		list1.add(record1);
      		list1.add(record2);
      		// record 3
      		recordBuilder = recordBuilderFactory.newRecordBuilder(customerSchema);
      		recordBuilder.withString("name", "Client Eastwood");
      		recordBuilder.withInt("age", 89);
      		Record record3 = recordBuilder.build();
      		// record 4
      		recordBuilder = recordBuilderFactory.newRecordBuilder(customerSchema);
      		recordBuilder.withString("name", "Jessica Chastain");
      		recordBuilder.withInt("age", 36);
      		Record record4 = recordBuilder.build();
      		// list 2
      		Collection<Record> list2 = new ArrayList<Record>();
      		list2.add(record3);
      		list2.add(record4);
      		// main list
      		Collection<Object> list3 = new ArrayList<Object>();
      		list3.add(list1);
      		list3.add(list2);
      		// schema of sub list
      		schemaBuilder = recordBuilderFactory.newSchemaBuilder(Type.ARRAY);
      		Schema subListSchema = schemaBuilder.withElementSchema(customerSchema).build();
      		// main record
      		recordBuilder = recordBuilderFactory.newRecordBuilder();
      		Entry entry = recordBuilderFactory.newEntryBuilder().withName("customers")
      				.withNullable(true).withType(Type.ARRAY).withElementSchema(subListSchema).build();
      		recordBuilder.withArray(entry, list3);
      		Record record = recordBuilder.build();
      

      The produced record looks like this:

      AvroRecord{delegate={"customers": [[AvroRecord{delegate={"name": "Tom Cruise", "age": 58}}, AvroRecord{delegate={"name": "Meryl Streep", "age": 63}}], [AvroRecord{delegate={"name": "Client Eastwood", "age": 89}}, AvroRecord{delegate={"name": "Jessica Chastain", "age": 36}}]]}}
      

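      We can see that the delegate Avro record contains TCK records (AvroRecord instances) inside the nested arrays, where Avro records (GenericData$Record) would be expected.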
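
      To illustrate the behaviour we would expect, here is a minimal, self-contained sketch of a recursive unwrapping of nested collections. It does not use the TCK or Avro classes: Wrapper is a hypothetical stand-in for AvroRecord, a plain Map stands in for GenericData$Record, and the unwrap and containsWrapper helpers are names made up for this illustration. It only shows the idea that wrappers should be replaced by their delegates at every nesting depth, not how the actual fix should be implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class NestedUnwrapSketch {

    // Hypothetical stand-in for AvroRecord: a wrapper around a delegate.
    // The Map delegate stands in for GenericData$Record (assumption).
    static final class Wrapper {
        final Map<String, Object> delegate;

        Wrapper(Map<String, Object> delegate) {
            this.delegate = delegate;
        }
    }

    // Recursively replaces every wrapper by its delegate, at any depth.
    static Object unwrap(Object value) {
        if (value instanceof Wrapper) {
            return unwrap(((Wrapper) value).delegate);
        }
        if (value instanceof Collection) {
            List<Object> out = new ArrayList<>();
            for (Object item : (Collection<?>) value) {
                out.add(unwrap(item));
            }
            return out;
        }
        if (value instanceof Map) {
            Map<String, Object> out = new LinkedHashMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                out.put(String.valueOf(e.getKey()), unwrap(e.getValue()));
            }
            return out;
        }
        return value; // primitives and strings are kept as they are
    }

    // Returns true if a wrapper is still present somewhere in the value.
    static boolean containsWrapper(Object value) {
        if (value instanceof Wrapper) {
            return true;
        }
        if (value instanceof Collection) {
            for (Object item : (Collection<?>) value) {
                if (containsWrapper(item)) {
                    return true;
                }
            }
            return false;
        }
        if (value instanceof Map) {
            return containsWrapper(((Map<?, ?>) value).values());
        }
        return false;
    }

    // Builds a wrapped "customer" record with the two fields of the example.
    static Wrapper customer(String name, int age) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("name", name);
        fields.put("age", age);
        return new Wrapper(fields);
    }

    public static void main(String[] args) {
        // Same shape as the "customers" entry: an array of arrays of records.
        List<Object> list1 = Arrays.<Object>asList(
                customer("Tom Cruise", 58), customer("Meryl Streep", 63));
        List<Object> list2 = Arrays.<Object>asList(
                customer("Client Eastwood", 89), customer("Jessica Chastain", 36));
        List<Object> customers = Arrays.<Object>asList(list1, list2);

        System.out.println(containsWrapper(customers)); // prints "true"
        Object unwrapped = unwrap(customers);
        System.out.println(containsWrapper(unwrapped)); // prints "false"
    }
}
```

      The point of the sketch is that unwrapping only the top-level value is not enough: with an array of arrays, the conversion has to recurse into each inner collection before it reaches the records.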

            emmanuel_g emmanuel gallois
            timbault Tony Imbault