  1. Talend Component Kit
  2. TCOMP-2132

Optimisation for preparation


Details

    • Work Item
    • Resolution: Fixed
    • Major
    • 1.46.0
    • None
    • schema-record

    Description

      Preparation functions spend a lot of time building a new Record with a new Schema.
      Optimisations could be made in the TCK Record and Record.Builder.

      Working tracks.
      getEntry
      Record.BuilderImpl contains a Map<String, Entry> that is only instantiated when a schema is provided and is not exposed through any public function (so the client has to scan all entries to find the right one).
      A function such as "Entry getEntry(String name)" would therefore be a welcome addition.
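
      The getEntry idea can be sketched roughly as follows. This is only an illustration under assumptions: SketchEntry and SketchRecordBuilder are hypothetical stand-ins, not the real TCK Schema.Entry or Record.BuilderImpl, and the actual fix may differ.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a schema entry; NOT the real TCK Schema.Entry.
final class SketchEntry {
    final String name;
    final String type;

    SketchEntry(String name, String type) {
        this.name = name;
        this.type = type;
    }
}

// Hypothetical builder that indexes the schema entries by name once,
// then exposes the index through a public lookup method.
final class SketchRecordBuilder {
    private final Map<String, SketchEntry> entriesByName = new LinkedHashMap<>();

    SketchRecordBuilder(List<SketchEntry> schemaEntries) {
        for (SketchEntry entry : schemaEntries) {
            entriesByName.put(entry.name, entry);
        }
    }

    // Direct lookup by name; returns null when no entry has that name.
    SketchEntry getEntry(String name) {
        return entriesByName.get(name);
    }
}

final class GetEntryDemo {
    public static void main(String[] args) {
        SketchRecordBuilder builder = new SketchRecordBuilder(Arrays.asList(
                new SketchEntry("name", "STRING"),
                new SketchEntry("age", "INT")));
        // The client no longer has to iterate over every entry itself.
        System.out.println(builder.getEntry("age").type);
    }
}
```

      With such a method, a caller gets the entry in a single map lookup instead of scanning the whole entry list for each field.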

      withNewSchema
      Builder.withNewSchema uses code like:

              newSchema.getAllEntries()
                      .filter(e -> Objects.equals(schema.getEntry(e.getName()), e))
      

      schema.getEntry(name) scans a list to find the right Entry. This means that if the schema contains N fields and the new schema M, it makes up to N×M comparisons. (For example, if the schema contains 50 fields and the new schema 51 (one field added), that is 50 × 51 = 2550 comparisons.)
      Putting the schema entries in a Map would reduce this to M×log(N) with a TreeMap, and a HashMap would statistically be even more efficient.
      (This function is not used by processing connectors.)
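
      A minimal sketch of the Map-based variant, assuming a simplified entry type: FieldEntry and UnchangedEntries are hypothetical names used only for illustration, not TCK classes. The old schema is indexed once (N insertions), then each of the M new entries costs one hash lookup plus one equals call.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a schema entry; NOT the real TCK class.
final class FieldEntry {
    final String name;
    final String type;

    FieldEntry(String name, String type) {
        this.name = name;
        this.type = type;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof FieldEntry)) {
            return false;
        }
        FieldEntry other = (FieldEntry) o;
        return name.equals(other.name) && type.equals(other.type);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, type);
    }
}

final class UnchangedEntries {
    // Returns the entries of newSchema that exist, unchanged, in oldSchema.
    static List<FieldEntry> find(List<FieldEntry> oldSchema, List<FieldEntry> newSchema) {
        // Index the old schema by entry name once.
        Map<String, FieldEntry> byName = new HashMap<>();
        for (FieldEntry entry : oldSchema) {
            byName.put(entry.name, entry);
        }
        List<FieldEntry> unchanged = new ArrayList<>();
        for (FieldEntry entry : newSchema) {
            // Same predicate as the filter above, with a map lookup
            // replacing the list scan.
            if (Objects.equals(byName.get(entry.name), entry)) {
                unchanged.add(entry);
            }
        }
        return unchanged;
    }

    public static void main(String[] args) {
        List<FieldEntry> oldSchema = new ArrayList<>();
        List<FieldEntry> newSchema = new ArrayList<>();
        for (int i = 0; i < 50; i++) {
            oldSchema.add(new FieldEntry("f" + i, "STRING"));
            newSchema.add(new FieldEntry("f" + i, "STRING"));
        }
        newSchema.add(new FieldEntry("added", "STRING")); // the one new field
        System.out.println(find(oldSchema, newSchema).size());
    }
}
```

      In the 50/51-field case this does 50 insertions and 51 lookups rather than up to 2550 name comparisons.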

      First performance test
      For 100 000 records with 100 fields each, with an uppercase processor on 1 column (adding a new column)
      (average of 4 runs for each case)

      Current TCK    Optimized TCK
      23561          9127

      More than 61% of the time saved.

            People

              pteyssier pierre teyssier
              clesaec Christophe LeSaec
              Christophe LeSaec, emmanuel gallois, Fabien Desiles