Details
- Work Item
- Status: closed
- Minor
- Resolution: Done
- None
- All
- DGA Sprint 8 (4 to 21 June), DGA Sprint 9 (25/6 to 12/7)
- GreenHopper Ranking:0|i1zqgb:
- 9223372036854775807
- Small
- 3
Description
As a Data Preparation user, in Runtime Convergence mode,
I want to run my preparation on a remote engine with a simple function (upper case) and a dataset as destination,
in order to benefit from common dataset features (e.g. output connectivity, Quality, Trust Score, reuse as input of a preparation or a pipeline).
[Backend API part: Provide an API to transform a TDP recipe into a TCK configuration + Generate and run a pipeline with a simple TCK function]
Objective: Use a simple migrated TCK function (from connectors-ee) during the preparation run
Why?
In Track 1, the pipeline is built and run using the Data Prep processor and the preparation definition, so it does not use the TCK functions on the Remote Engine.
We need to switch to the new approach for the available function(s) (starting with Uppercase) to be able to validate the migrated TCK functions.
How?
- Provide a new endpoint in the prepV2 service (to be called at a later stage by the Pipeline Designer Data Prep processor + the tDataPrepRun Studio component)
- Definition and mock => TDP-10144/{preparationId}/runs/recipe - Mapping from a TDP recipe (TDP steps: functions & parameters) to a list of TCK functions with parameters (migrated TCK functions)
- Instantiation of the mapping for a simple function (Uppercase, Lowercase, Concat)
- Definition and mock =>
- Check and adapt mapping from TCK function to TDP recipe (add function)
- Integrate input + TCK config (result of mapping of the TDP recipe into a list of TCK functions) + output to call Pipeline API
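The mapping and integration steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the action names, the TCK function names, the parameter keys, and the shape of the request passed to the Pipeline API are all hypothetical and are not taken from the actual prepV2 or Pipeline API contracts.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Hypothetical lookup from TDP action names to migrated TCK function names.
# The real names depend on the functions migrated from connectors-ee.
TDP_TO_TCK: Dict[str, str] = {
    "uppercase": "UpperCase",
    "lowercase": "LowerCase",
    "concat": "Concat",
}


@dataclass
class TckFunction:
    """One TCK function with its parameters (illustrative shape only)."""
    name: str
    parameters: Dict[str, Any] = field(default_factory=dict)


def map_recipe(steps: List[Dict[str, Any]]) -> List[TckFunction]:
    """Map TDP steps (function + parameters) to a list of TCK functions, in order."""
    functions = []
    for step in steps:
        action = step["action"]
        if action not in TDP_TO_TCK:
            # Function completeness is out of scope: reject unmapped actions.
            raise ValueError(f"No migrated TCK function for TDP action {action!r}")
        functions.append(TckFunction(TDP_TO_TCK[action], dict(step.get("parameters", {}))))
    return functions


def build_pipeline_request(input_dataset_id: str,
                           functions: List[TckFunction],
                           output_dataset_id: str) -> Dict[str, Any]:
    """Assemble input + TCK config + output into one (hypothetical) request body."""
    return {
        "input": {"datasetId": input_dataset_id},
        "functions": [{"name": f.name, "parameters": f.parameters} for f in functions],
        "output": {"datasetId": output_dataset_id},
    }


# Usage: a recipe with a single uppercase step on a hypothetical column id.
recipe = [{"action": "uppercase", "parameters": {"column_id": "0000"}}]
request = build_pipeline_request("dataset1", map_recipe(recipe), "dataset2")
```

Failing fast on an unmapped action is one possible design choice here; it makes it obvious during validation which TDP functions have no migrated TCK counterpart yet.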
Acceptance criteria
Scenario: Run a preparation with 2 uppercase steps (with UI)
Given a tenant with Runtime Convergence activated,
a dataset dataset1 (e.g. S3 or local connection) with at least two text columns
and a dataset dataset2 created with the name DATASET_OUTPUT_FULLRUN (e.g. S3 or local connection)
and a preparation based on dataset1
When the user applies the "change to upper case" function on columnA - without creating a new column
And applies the "change to upper case" function on columnB - creating a new column
And presses the "Export" button
Then the preparation result is exported to dataset2 DATASET_OUTPUT_FULLRUN with the correct output (e.g. check the sample)
- columnA values are in upper case
- a new column contains columnB values in upper case
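As a rough illustration of the expected result, the two steps of this scenario can be simulated on an in-memory sample. This is not the TCK Uppercase function itself: `apply_uppercase`, the `_upper` suffix for the new column, and the sample values are all assumptions made for the example.

```python
from typing import Any, Dict, List


def apply_uppercase(rows: List[Dict[str, Any]],
                    column: str,
                    create_new_column: bool = False,
                    new_column_suffix: str = "_upper") -> List[Dict[str, Any]]:
    """Upper-case the text values of `column`, in place or into a new column.

    The naming of the new column is hypothetical; the real naming rule is
    defined by the product, not by this sketch.
    """
    target = column + new_column_suffix if create_new_column else column
    result = []
    for row in rows:
        new_row = dict(row)  # copy so the input sample is left untouched
        value = row[column]
        new_row[target] = value.upper() if isinstance(value, str) else value
        result.append(new_row)
    return result


# Hypothetical sample of dataset1 with two text columns.
sample = [
    {"columnA": "alice", "columnB": "paris"},
    {"columnA": "Bob", "columnB": "Lyon"},
]

# Step 1: upper case on columnA without creating a new column.
# Step 2: upper case on columnB, creating a new column.
output = apply_uppercase(apply_uppercase(sample, "columnA"),
                         "columnB", create_new_column=True)
```

With this sample, `output[0]` is `{"columnA": "ALICE", "columnB": "paris", "columnB_upper": "PARIS"}`: columnA is overwritten, while columnB keeps its original values and the new column holds the upper-cased ones.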
Out of scope:
- Interactive mode with TCK => only the run is covered (previously known as export fullrun); in interactive mode, the legacy TDP pipeline is still used
- Handling of statistics and semantic types => only simple functions covered
- Function completeness => start with at least Uppercase TCK function
Issue Links
- has to be done before
  - TDP-9977 [RunConv][Backend] Integrate migrated TCK functions - simple [batch 1] (closed)
- is related to
  - TDP-10144 [RunConv] Provide an API mock for preparation payload (closed)
  - TDP-8122 PLAYGROUND | FUNCTIONS | Apply a function on scope column (New)
  - TDP-8193 PLAYGROUND | TRANSVERSAL | Export FullRun (New)
  - TDP-9905 [RunConv][Backend] Run a preparation with different dataset types as source/destination (closed)