  1. Talend Component Kit
  2. TCOMP-317

Improve performance of DI components to be close to existing component framework

Details

    • All
    • 0.16.0, 0.18.0
    • Small
    • 13

    Description

      Two jobs were used, each with a schema of 30 columns, all of type string:
      1) tRowGenerator -> tFileOutputDelimited
      2) tFileInputDelimited -> tJavaRow

      The data produced by job 1 is used as input for job 2.

      In 6.2.1, for job2:

      100 000 lines:
      2016-10-17 20:24:14;launcher_fileInputDelimited;begin;;
      2016-10-17 20:24:17;launcher_fileInputDelimited;end;success;2933
      2016-10-17 20:24:23;launcher_fileInputDelimited;begin;;
      2016-10-17 20:24:26;launcher_fileInputDelimited;end;success;2833
      2016-10-17 20:24:31;launcher_fileInputDelimited;begin;;
      2016-10-17 20:24:34;launcher_fileInputDelimited;end;success;2813
      
      200 000 lines:
      2016-10-17 20:22:54;launcher_fileInputDelimited;begin;;
      2016-10-17 20:22:58;launcher_fileInputDelimited;end;success;4493
      
      400 000 lines:
      2016-10-17 20:21:09;launcher_fileInputDelimited;begin;;
      2016-10-17 20:21:15;launcher_fileInputDelimited;end;success;6333
      

      In 6.3:

      100 000 lines:
      2016-10-17 20:27:41;launcher_fileInputDelimited;begin;;
      2016-10-17 20:28:33;launcher_fileInputDelimited;end;success;51822
      2016-10-17 20:30:31;launcher_fileInputDelimited;begin;;
      2016-10-17 20:31:22;launcher_fileInputDelimited;end;success;51332
      

      That is roughly 18 times longer (about 51.6 s on average versus about 2.9 s in 6.2.1).
      I did not run the 200 000-line and 400 000-line use cases on 6.3.
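
      The slowdown for the 100 000-line runs can be re-derived from the log lines above. The following is a minimal sketch, assuming the last field of each `end` record is the run duration in milliseconds (which matches the begin/end timestamps); the function names `end_durations` and `slowdown` are illustrative and not part of any Talend tooling.

```python
from statistics import mean

# Log lines copied from the description (job 2, 100 000 lines).
LOG_6_2_1 = """\
2016-10-17 20:24:14;launcher_fileInputDelimited;begin;;
2016-10-17 20:24:17;launcher_fileInputDelimited;end;success;2933
2016-10-17 20:24:23;launcher_fileInputDelimited;begin;;
2016-10-17 20:24:26;launcher_fileInputDelimited;end;success;2833
2016-10-17 20:24:31;launcher_fileInputDelimited;begin;;
2016-10-17 20:24:34;launcher_fileInputDelimited;end;success;2813
"""

LOG_6_3 = """\
2016-10-17 20:27:41;launcher_fileInputDelimited;begin;;
2016-10-17 20:28:33;launcher_fileInputDelimited;end;success;51822
2016-10-17 20:30:31;launcher_fileInputDelimited;begin;;
2016-10-17 20:31:22;launcher_fileInputDelimited;end;success;51332
"""


def end_durations(log):
    """Return the last field of every `end` record (assumed to be milliseconds)."""
    durations = []
    for line in log.splitlines():
        fields = line.strip().split(";")
        # Expected layout: timestamp;launcher;event;status;duration
        if len(fields) >= 5 and fields[2] == "end":
            durations.append(int(fields[4]))
    return durations


def slowdown(old_log, new_log):
    """Ratio of the mean run duration in new_log to the mean in old_log."""
    return mean(end_durations(new_log)) / mean(end_durations(old_log))


if __name__ == "__main__":
    print("6.2.1 mean (ms):", round(mean(end_durations(LOG_6_2_1)), 1))
    print("6.3 mean (ms):  ", round(mean(end_durations(LOG_6_3)), 1))
    print("slowdown factor:", round(slowdown(LOG_6_2_1, LOG_6_3), 2))
```

      With the five runs listed, the means are about 2860 ms and 51577 ms, which gives a factor of about 18. Only three and two samples are available, so the figure is indicative rather than a precise benchmark.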

      Job 1 (tFileOutputDelimited) shows a 50% increase in execution time.

      The issue had already been detected on the Salesforce component, but there its effect was small compared with network latency.

      Origin: one hypothesis is the multiple conversion steps to and from Avro.

      Attachments

        1. fileinputdelimited.png
          152 kB
          Ivan Gonchar
        2. fileoutputdelimited.png
          157 kB
          Ivan Gonchar
        3. Performance Report.xlsx
          14 kB
          Ivan Gonchar

            People

              igonchar Ivan Gonchar
              pteyssier pierre teyssier
              Dmytro Chmyga (Inactive), Ivan Gonchar
              Votes: 0
              Watchers: 5

                Time Tracking

                  Estimated: 3d 6h
                  Remaining: 6h 30m
                  Logged: 1w 4d 5h