Details
- Bug
- Resolution: Fixed
- Minor
- None
- All
- Small
Description
Reproduce the issue
Create an SQL table (in Snowflake, for example)
create or replace table TEST_DATE( date DATE );
INSERT INTO TEST_DATE VALUES ('1989-07-07'), ('1989-07-07'), ('1989-07-07'), (null), (null);
Run a simple read/write pipeline (composed of TableNameInput and S3Output components) that writes to an Avro file on S3. The schema of the output file is correct, but the values are not: the null dates are written as -1.
The schema
{
  "type" : "record",
  "name" : "Record_1_2789471843289431261",
  "namespace" : "org.talend.sdk.component.schema.generated",
  "fields" : [ {
    "name" : "DATE",
    "type" : [ "null", {
      "type" : "long",
      "logicalType" : "timestamp-millis",
      "talend.component.DATETIME" : "true"
    } ]
  } ]
}
The data
{"DATE":{"long":615772800000}}
{"DATE":{"long":615772800000}}
{"DATE":{"long":615772800000}}
{"DATE":{"long":-1}}
{"DATE":{"long":-1}}
Expected data
{"DATE":{"long":615772800000}}
{"DATE":{"long":615772800000}}
{"DATE":{"long":615772800000}}
{"DATE":null}
{"DATE":null}
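To see why -1 is harmful rather than merely untidy, it helps to decode the values the way a downstream reader of a timestamp-millis field would. The snippet below is a small standalone check using only java.time; the class and method names are made up for illustration and are not part of the component framework.

```java
import java.time.Instant;

public class TimestampMillisCheck {
    // Interprets a timestamp-millis value as a UTC instant,
    // which is how the Avro logical type is defined.
    static Instant toInstant(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis);
    }

    public static void main(String[] args) {
        // The real date from the test table.
        System.out.println(toInstant(615772800000L)); // 1989-07-07T00:00:00Z
        // The value written for the null rows: a valid instant, not "no value".
        System.out.println(toInstant(-1L));           // 1969-12-31T23:59:59.999Z
    }
}
```

A consumer of the file therefore cannot tell the null rows apart from a genuine timestamp one millisecond before the epoch.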
Root cause analysis
Everything started with this commit, 3 years ago, in which, for some reason, null dates began to be stored as the value -1.
When getting a date field, a conversion is done to return null instead of -1.
However, this ad-hoc approach to storing dates has a limitation: developers must remember to implement the "-1 => null" logic in every implementation of Record and in every code path that reads the stored value.
That is the problem here: when converting an AvroRecord to an IndexedRecord, this conversion was forgotten, so -1 is returned instead of null.
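The failure mode can be sketched in a few lines. This is a simplified illustration, not the framework's code: SketchRecord, putDate, getDate and getRaw are hypothetical names standing in for a record implementation, its typed getter, and a conversion path that reads the stored value directly.

```java
import java.util.HashMap;
import java.util.Map;

public class SentinelDateSketch {
    // Sentinel used for null dates, as described in the issue.
    static final long NULL_DATE = -1L;

    // Hypothetical stand-in for a record that stores null dates as -1.
    static class SketchRecord {
        private final Map<String, Long> values = new HashMap<>();

        void putDate(String name, Long epochMillis) {
            values.put(name, epochMillis == null ? NULL_DATE : epochMillis);
        }

        // Typed getter: remembers the "-1 => null" conversion.
        Long getDate(String name) {
            Long raw = values.get(name);
            return (raw == null || raw == NULL_DATE) ? null : raw;
        }

        // Direct read of the stored value, as a conversion to another
        // record type might do: the sentinel leaks out unchanged.
        Object getRaw(String name) {
            return values.get(name);
        }
    }

    public static void main(String[] args) {
        SketchRecord record = new SketchRecord();
        record.putDate("DATE", null);
        System.out.println(record.getDate("DATE")); // null
        System.out.println(record.getRaw("DATE"));  // -1
    }
}
```

Every new read path has to repeat the check in getDate, and any path that skips it silently emits -1.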
This can be fixed in two ways:
- Add the conversion logic for this case, and remember to do the same for every future use case.
- Fix it once and for all and store null dates as null (technically, do not store them), as is done for other types.
Both solutions have been implemented in the PR.
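As a rough sketch of the second option, under the same assumptions as before (SketchRecord and its methods are hypothetical names, not the framework's API), a null date is simply not stored, so no read path needs a special case:

```java
import java.util.HashMap;
import java.util.Map;

public class NullableDateSketch {
    // Hypothetical stand-in for a record that does not store null dates.
    static class SketchRecord {
        private final Map<String, Long> values = new HashMap<>();

        void putDate(String name, Long epochMillis) {
            // A null date leaves the field absent instead of writing a sentinel.
            if (epochMillis != null) {
                values.put(name, epochMillis);
            }
        }

        // Typed getter: an absent field is null, no conversion needed.
        Long getDate(String name) {
            return values.get(name);
        }

        // Direct read of the stored value: also null for an absent field.
        Object getRaw(String name) {
            return values.get(name);
        }
    }

    public static void main(String[] args) {
        SketchRecord record = new SketchRecord();
        record.putDate("DATE", null);
        record.putDate("JUST_BEFORE_EPOCH", -1L);
        System.out.println(record.getDate("DATE"));              // null
        System.out.println(record.getRaw("DATE"));               // null
        System.out.println(record.getDate("JUST_BEFORE_EPOCH")); // -1
    }
}
```

A side effect of this design is that -1 becomes an ordinary value again, so a real timestamp one millisecond before the epoch is no longer mistaken for null.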
Issue Links
- is duplicated by
  - TCOMP-1811 Record Builder API should work with Object not primitive (New)
  - TCOMP-1810 Record interface is not coherent for ZonedDateTime (Rejected)
- is related to
  - TCOMP-1810 Record interface is not coherent for ZonedDateTime (Rejected)