Details
-
Bug
-
Resolution: Fixed
-
Critical
-
0.16.0
-
None
-
All
-
0.16.0, 0.18.0
-
Small
-
13
Description
Two jobs were used, each with a schema of 30 columns, all of type string:
1) tRowGenerator -> tFileOutputDelimited
2) tFileInputDelimited -> tJavaRow
Data produced by job1 is used by job2.
In 6.2.1, for job2:
100 000 lines:
2016-10-17 20:24:14;launcher_fileInputDelimited;begin;;
2016-10-17 20:24:17;launcher_fileInputDelimited;end;success;2933
2016-10-17 20:24:23;launcher_fileInputDelimited;begin;;
2016-10-17 20:24:26;launcher_fileInputDelimited;end;success;2833
2016-10-17 20:24:31;launcher_fileInputDelimited;begin;;
2016-10-17 20:24:34;launcher_fileInputDelimited;end;success;2813
200 000 lines:
2016-10-17 20:22:54;launcher_fileInputDelimited;begin;;
2016-10-17 20:22:58;launcher_fileInputDelimited;end;success;4493
400 000 lines:
2016-10-17 20:21:09;launcher_fileInputDelimited;begin;;
2016-10-17 20:21:15;launcher_fileInputDelimited;end;success;6333
In 6.3:
100 000 lines:
2016-10-17 20:27:41;launcher_fileInputDelimited;begin;;
2016-10-17 20:28:33;launcher_fileInputDelimited;end;success;51822
2016-10-17 20:30:31;launcher_fileInputDelimited;begin;;
2016-10-17 20:31:22;launcher_fileInputDelimited;end;success;51332
For 100 000 lines, that is roughly 18 times longer (about 51.6 s on average versus about 2.9 s).
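For anyone who wants to re-check the slowdown from the logs, here is a minimal sketch that parses the 100 000-line records quoted above and compares the mean durations. It assumes the last field of each `end` record is the elapsed time in milliseconds (which matches the begin/end timestamps); the helper names `parse_durations` and `slowdown` are made up for this illustration and are not part of any Talend tooling.

```python
from statistics import mean

# 100 000-line runs, copied from the logs in this report.
LOG_6_2_1 = """
2016-10-17 20:24:14;launcher_fileInputDelimited;begin;;
2016-10-17 20:24:17;launcher_fileInputDelimited;end;success;2933
2016-10-17 20:24:23;launcher_fileInputDelimited;begin;;
2016-10-17 20:24:26;launcher_fileInputDelimited;end;success;2833
2016-10-17 20:24:31;launcher_fileInputDelimited;begin;;
2016-10-17 20:24:34;launcher_fileInputDelimited;end;success;2813
"""

LOG_6_3 = """
2016-10-17 20:27:41;launcher_fileInputDelimited;begin;;
2016-10-17 20:28:33;launcher_fileInputDelimited;end;success;51822
2016-10-17 20:30:31;launcher_fileInputDelimited;begin;;
2016-10-17 20:31:22;launcher_fileInputDelimited;end;success;51332
"""


def parse_durations(log_text):
    """Return the durations of successful 'end' records.

    Assumed record layout: timestamp;job;event;status;duration
    with the duration in milliseconds.
    """
    durations = []
    for line in log_text.strip().splitlines():
        fields = line.strip().split(";")
        if len(fields) == 5 and fields[2] == "end" and fields[3] == "success":
            durations.append(int(fields[4]))
    return durations


def slowdown(baseline_ms, candidate_ms):
    """Ratio of the candidate's mean duration to the baseline's mean duration."""
    return mean(candidate_ms) / mean(baseline_ms)


old_runs = parse_durations(LOG_6_2_1)
new_runs = parse_durations(LOG_6_3)
ratio = slowdown(old_runs, new_runs)

print(f"6.2.1 mean: {mean(old_runs):.0f} ms over {len(old_runs)} runs")
print(f"6.3 mean:   {mean(new_runs):.0f} ms over {len(new_runs)} runs")
print(f"slowdown:   {ratio:.1f}x")
```

With the figures logged here the ratio comes out at about 18. The sample is small (three runs against two), so this only indicates the order of magnitude.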
I did not run the 200 000-line and 400 000-line use cases in 6.3.
Job1 (tFileOutputDelimited) shows an increase of 50%.
The issue had already been detected on the Salesforce component, but there the effect was small compared to network latency.
Origin: one hypothesis is the multiple conversion steps to and from Avro.