-
New Feature
-
Resolution: Fixed
-
Major
-
None
-
All
-
Small
-
To be defined
Description of need
In Studio, some TCK processors, such as "dataprep", change the schema of incoming records. The processor can determine the kind of change (renaming a field, adding one, and so on) from the schema of the incoming record, its own configuration, and the outgoing branch.
The need is therefore twofold: a "button" on the processor that determines the schema of the produced records, and Studio schema propagation that uses this service when it exists.
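To illustrate how the outgoing schema can follow from the three inputs named above, here is a minimal sketch. It deliberately does not use the Talend Component Kit `Schema` API: a schema is modelled as an ordered map of field name to type name, and the `Conf` class, the `errorMessage` column, and the behaviour chosen for the `REJECT` branch are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in model: a "schema" is an ordered map of field name -> type name.
// This is NOT the Talend Component Kit Schema API, only an illustration.
final class SchemaDerivationSketch {

    // Hypothetical processor configuration: rename one field and add another.
    static final class Conf {
        final String renameFrom;
        final String renameTo;
        final String addedField;
        final String addedType;

        Conf(String renameFrom, String renameTo, String addedField, String addedType) {
            this.renameFrom = renameFrom;
            this.renameTo = renameTo;
            this.addedField = addedField;
            this.addedType = addedType;
        }
    }

    // Computes the outgoing schema from the incoming schema, the configuration and the branch.
    static Map<String, String> deriveSchema(Map<String, String> incoming, Conf conf, String branch) {
        Map<String, String> out = new LinkedHashMap<>();
        if ("REJECT".equals(branch)) {
            // Assumption: rejected records keep the incoming layout plus an error column.
            out.putAll(incoming);
            out.put("errorMessage", "STRING");
            return out;
        }
        // Any other branch: apply the rename, then append the added field.
        for (Map.Entry<String, String> field : incoming.entrySet()) {
            String name = field.getKey().equals(conf.renameFrom) ? conf.renameTo : field.getKey();
            out.put(name, field.getValue());
        }
        out.put(conf.addedField, conf.addedType);
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> incoming = new LinkedHashMap<>();
        incoming.put("id", "INT");
        incoming.put("name", "STRING");
        Conf conf = new Conf("name", "fullName", "score", "DOUBLE");
        System.out.println(deriveSchema(incoming, conf, "MAIN"));
        System.out.println(deriveSchema(incoming, conf, "REJECT"));
    }
}
```

The point of the sketch is that no record has to flow through the processor: the result depends only on the incoming schema, the configuration, and the branch name, which is what lets Studio compute it at design time.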
Supported method signatures for @DiscoverSchemaExtended annotation:
/**
 * @param incomingSchema the schema of the input flow
 * @param procConf the configuration of the processor (not a @Dataset)
 * @param branch the name of the output flow for which the computed schema is expected (FLOW, MAIN, REJECT, etc.)
 * @return the schema computed for the given output flow
 */
@DiscoverSchemaExtended("full")
public Schema guessMethodName(final Schema incomingSchema, final @Option("configuration") ProcConf procConf, final String branch) {...}

@DiscoverSchemaExtended("incoming_schema")
public Schema guessMethodName(final Schema incomingSchema, final @Option ProcConf procConf) {...}

@DiscoverSchemaExtended("branch")
public Schema guessMethodName(final @Option("configuration") ProcConf procConf, final String branch) {...}

@DiscoverSchemaExtended("minimal")
public Schema guessMethodName(final @Option ProcConf procConf) {...}
Annotation value:
The annotation action name should match the connector's name.
Example:
@Data
@Processor(family = "TaCoKitGuessSchema", name = "outputDi")
public static class StudioProcessor implements Serializable {

    @Option
    private ProcConf configuration;

    @ElementListener
    public Object next(Record in, Record out) {
        return null;
    }
}
In service class:
@Service
public static class StudioProcessorService implements Serializable {

    // The action name "outputDi" matches the name of the processor above.
    @DiscoverSchemaExtended("outputDi")
    public Schema discoverProcessorSchema(final Schema incomingSchema, @Option("configuration") final ProcConf conf, final String branch) {
        ...
    }
}
In Studio, by default, the guess schema first searches for an action named after the component. If that search fails, it tries to find a default one.
In cloud environments, finding the action name through the component-server is trickier, so I strongly recommend naming the action after the target connector.
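The lookup order described above can be sketched as follows. This is an illustration only, not Studio's actual code: the ticket does not say how the default action is identified, so the `fallbackName` parameter and the `resolveAction` helper are hypothetical.

```java
import java.util.Optional;
import java.util.Set;

// Illustrative lookup order only; Studio's real resolution logic is not shown in this ticket.
final class ActionLookupSketch {

    // Returns the action to call: the one named after the component if it is declared,
    // otherwise the (hypothetical) fallback action, otherwise nothing.
    static Optional<String> resolveAction(Set<String> declaredActions, String componentName, String fallbackName) {
        if (declaredActions.contains(componentName)) {
            return Optional.of(componentName);
        }
        if (declaredActions.contains(fallbackName)) {
            return Optional.of(fallbackName);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // "outputDi" comes from the example above; "someDefault" is a made-up name.
        System.out.println(resolveAction(Set.of("outputDi", "someDefault"), "outputDi", "someDefault"));
        System.out.println(resolveAction(Set.of("someDefault"), "outputDi", "someDefault"));
    }
}
```

Naming the action after the connector means the first branch always succeeds, so the result does not depend on how any fallback is resolved.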
- is related to
-
TCOMP-2235 DiscoverSchema - Be able to give the input/output full configuration to the service
- Done
-
TCOMP-2311 DiscoverSchemaExtended validation is too strict
- Done
-
TCOMP-2283 Handle Dynamic, Document and Byte column types in guess schema
- Done