  1. Talend Component Kit
  2. TCOMP-317

Improve performance of DI components to be close to existing component framework


Details

    • All
    • 0.16.0, 0.18.0
    • Small
    • 13

    Description

      Two jobs were used, with a schema of 30 columns, all strings:
      1) tRowGenerator -> tFileOutputDelimited
      2) tFileInputDelimited -> tJavaRow

      Data produced by job1 is used by job2.

      In 6.2.1, for job2:

      100 000 lines:
      2016-10-17 20:24:14;launcher_fileInputDelimited;begin;;
      2016-10-17 20:24:17;launcher_fileInputDelimited;end;success;2933
      2016-10-17 20:24:23;launcher_fileInputDelimited;begin;;
      2016-10-17 20:24:26;launcher_fileInputDelimited;end;success;2833
      2016-10-17 20:24:31;launcher_fileInputDelimited;begin;;
      2016-10-17 20:24:34;launcher_fileInputDelimited;end;success;2813
      
      200 000 lines:
      2016-10-17 20:22:54;launcher_fileInputDelimited;begin;;
      2016-10-17 20:22:58;launcher_fileInputDelimited;end;success;4493
      
      400 000 lines:
      2016-10-17 20:21:09;launcher_fileInputDelimited;begin;;
      2016-10-17 20:21:15;launcher_fileInputDelimited;end;success;6333
      

      In 6.3:

      100 000 lines:
      2016-10-17 20:27:41;launcher_fileInputDelimited;begin;;
      2016-10-17 20:28:33;launcher_fileInputDelimited;end;success;51822
      2016-10-17 20:30:31;launcher_fileInputDelimited;begin;;
      2016-10-17 20:31:22;launcher_fileInputDelimited;end;success;51332
      

      That is roughly 18 times longer (about 51.6 s versus 2.9 s on average).
      I did not run the 200 000 lines and 400 000 lines use cases.
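
As a quick check of the slowdown for the 100 000 line case, the ratio can be recomputed from the log lines above. This is a minimal sketch: it assumes the last field of each `end` line is the elapsed time in milliseconds, and the helper names `elapsed_ms` and `slowdown` are hypothetical, introduced only for illustration.

```python
from statistics import mean

# Log lines copied from the 100 000 line runs above.
LOG_621 = """\
2016-10-17 20:24:14;launcher_fileInputDelimited;begin;;
2016-10-17 20:24:17;launcher_fileInputDelimited;end;success;2933
2016-10-17 20:24:23;launcher_fileInputDelimited;begin;;
2016-10-17 20:24:26;launcher_fileInputDelimited;end;success;2833
2016-10-17 20:24:31;launcher_fileInputDelimited;begin;;
2016-10-17 20:24:34;launcher_fileInputDelimited;end;success;2813
"""

LOG_63 = """\
2016-10-17 20:27:41;launcher_fileInputDelimited;begin;;
2016-10-17 20:28:33;launcher_fileInputDelimited;end;success;51822
2016-10-17 20:30:31;launcher_fileInputDelimited;begin;;
2016-10-17 20:31:22;launcher_fileInputDelimited;end;success;51332
"""


def elapsed_ms(log_text):
    """Collect the last field of every 'end' line (assumed to be milliseconds)."""
    return [
        int(line.split(";")[4])
        for line in log_text.splitlines()
        if ";end;" in line
    ]


def slowdown(baseline_ms, candidate_ms):
    """Ratio of the mean candidate duration to the mean baseline duration."""
    return mean(candidate_ms) / mean(baseline_ms)


runs_621 = elapsed_ms(LOG_621)
runs_63 = elapsed_ms(LOG_63)
ratio = slowdown(runs_621, runs_63)

print(f"6.2.1 mean: {mean(runs_621):.0f} ms over {len(runs_621)} runs")
print(f"6.3 mean:   {mean(runs_63):.0f} ms over {len(runs_63)} runs")
print(f"slowdown:   {ratio:.1f}x")
```

Comparing means rather than single runs smooths out the small run-to-run variation visible in the 6.2.1 figures.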

      job1 (tFileOutputDelimited) shows a 50% increase in execution time.

      The issue was already detected on the Salesforce component, but there the effect was small compared to network latency.

      Origin: one hypothesis is the multiple conversion steps from/to Avro.

            People

              igonchar Ivan Gonchar
              pteyssier pierre teyssier
              Dmytro Chmyga (Inactive), Ivan Gonchar
              Votes: 0
              Watchers: 5

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: 3d 6h
                  Remaining: 6h 30m
                  Logged: 1w 4d 5h