Details
Bug
Status: Done
Minor
Resolution: Fixed
None
None
All
Small
Description
I have an Avro payload (see the attached file customers_ordres.avro) that I am trying to convert to a TCK record. For this I use the org.talend.components.common.stream.input.avro.AvroToRecord tool.
The input Avro payload contains a "customer" array of records. The schema of these records is the following:
{
  "type": "record",
  "name": "customer",
  "fields": [
    { "name": "custid", "type": ["null", "string"], "default": null },
    { "name": "name", "type": ["null", "string"], "default": null },
    {
      "name": "address",
      "type": ["null", {
        "type": "record",
        "name": "address",
        "fields": [
          { "name": "street", "type": ["null", "string"], "default": null },
          { "name": "city", "type": ["null", "string"], "default": null },
          { "name": "zipcode", "type": ["null", "string"], "default": null }
        ]
      }],
      "default": null
    },
    { "name": "rating", "type": ["null", "int"], "default": null }
  ]
}
Some customers don't have all the information: for example, the last customer doesn't have a zipcode.
I noticed that when I convert this Avro payload to a TCK record, some nested records have an incomplete TCK schema. For example, for the last customer, who has no zipcode, the TCK schema attached to the corresponding TCK record does not have a zipcode field. This TCK schema should have a zipcode field, since the corresponding Avro schema declares one.
A small change to the AvroToRecord code would make it generate a complete TCK schema:
- Line 152: add inferSchema(record) when calling recordBuilderFactory.newRecordBuilder
private void buildArrayField(org.apache.avro.Schema.Field field, Collection<?> value, Record.Builder recordBuilder, Entry entry) {
    final org.apache.avro.Schema arraySchema = AvroHelper.getUnionSchema(field.schema());
    final org.apache.avro.Schema arrayInnerType = arraySchema.getElementType();
    final Collection<?> objectArray;
    switch (arrayInnerType.getType()) {
    case RECORD:
        objectArray = ((Collection<GenericRecord>) value).stream()
                .map(record -> avroToRecord(record, arrayInnerType.getFields(),
                        recordBuilderFactory.newRecordBuilder(inferSchema(record))))
                .collect(Collectors.toList());
        break;
- Line 188: add inferSchema((GenericRecord) value) when calling recordBuilderFactory.newRecordBuilder
protected void buildField(org.apache.avro.Schema.Field field, Object value, Record.Builder recordBuilder, Entry entry) {
    String logicalType = field.schema().getProp(AVRO_LOGICAL_TYPE);
    org.apache.avro.Schema.Type fieldType = AvroHelper.getFieldType(field);
    switch (fieldType) {
    case RECORD:
        recordBuilder.withRecord(entry,
                avroToRecord((GenericRecord) value, ((GenericRecord) value).getSchema().getFields(),
                        recordBuilderFactory.newRecordBuilder(inferSchema((GenericRecord) value))));
        break;
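To illustrate why passing a schema up front matters, here is a minimal, library-free sketch. It does not use the real Avro or TCK APIs: the class SchemaSketch and its two methods are hypothetical, the address values are made up, and it assumes that a builder created without a schema derives its fields only from the values actually set, which is the behaviour the symptoms above suggest.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the two ways a record schema can be obtained.
final class SchemaSketch {

    // Assumed behaviour without a schema: only fields with a non-null value
    // end up in the resulting schema, so a missing zipcode disappears.
    static List<String> schemaFromValues(Map<String, Object> values) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Object> e : values.entrySet()) {
            if (e.getValue() != null) {
                names.add(e.getKey());
            }
        }
        return names;
    }

    // Behaviour with a schema inferred from the declared fields:
    // every declared field is kept, whether or not a value is present.
    static List<String> schemaFromDeclaration(List<String> declaredFields) {
        return new ArrayList<>(declaredFields);
    }

    public static void main(String[] args) {
        // Field names come from the "address" record of the schema above.
        List<String> declared = Arrays.asList("street", "city", "zipcode");

        // Made-up values for a customer without a zipcode.
        Map<String, Object> lastCustomerAddress = new LinkedHashMap<>();
        lastCustomerAddress.put("street", "1 Example Street");
        lastCustomerAddress.put("city", "Sampletown");
        lastCustomerAddress.put("zipcode", null);

        System.out.println(schemaFromValues(lastCustomerAddress)); // [street, city]
        System.out.println(schemaFromDeclaration(declared));       // [street, city, zipcode]
    }
}
```

Under that assumption, the proposed change corresponds to the second method: the nested record's schema is fixed from the Avro declaration before any value is written, so a null zipcode no longer removes the field.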
Attachments
Issue Links
- prerequisite of: TDI-46439 Exception raised when trying to convert an avro record to tck record (Done)