
What is a Transformer Pipeline?

A Transformer pipeline describes the flow of data from origin systems to destination systems and defines how to transform the data along the way.

Transformer pipelines are designed in Pipeline Designer or in Transformer, and executed by Transformer.

You can include the following stages in Transformer pipelines:

Origins

An origin stage represents an origin system. A pipeline can include multiple origin stages. If you use more than one origin in a pipeline, you must use a Join processor to join the data read by the origins. Each Join processor can join data from two input streams.

Processors

You can use multiple processor stages to perform complex transformations on the data.

Destinations

A destination stage represents a destination system. A pipeline can include multiple destination stages.

When you develop a pipeline, you can also include development stages to provide sample data and generate errors to test error handling.


sklearn.pipeline.Pipeline

Pipeline of transforms with a final estimator.

Sequentially apply a list of transforms and a final estimator. Intermediate steps of the pipeline must be ‘transforms’, that is, they must implement fit and transform methods. The final estimator only needs to implement fit. The transformers in the pipeline can be cached using the memory argument.
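As a minimal sketch of construction and use (the dataset and the step names "scaler" and "svc" are illustrative choices, not required by the API):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each intermediate step implements fit/transform; the last only needs fit.
pipe = Pipeline(steps=[("scaler", StandardScaler()), ("svc", SVC())])
pipe.fit(X_train, y_train)         # fits the scaler, transforms, then fits the SVC
print(pipe.score(X_test, y_test))  # transforms X_test with the fitted scaler, then scores
```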

The purpose of the pipeline is to assemble several steps that can be cross-validated together while setting different parameters. For this, it enables setting parameters of the various steps using their names and the parameter name separated by a ‘__’, as in the example below. A step’s estimator may be replaced entirely by setting the parameter with its name to another estimator, or a transformer removed by setting it to ‘passthrough’ or None.
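A sketch of the ‘__’ addressing scheme (the step names, parameter grid, and dataset here are illustrative):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
pipe = Pipeline([("scaler", StandardScaler()),
                 ("clf", LogisticRegression(max_iter=1000))])

# <step name>__<parameter name> addresses a step's parameter, so the whole
# pipeline can be tuned and cross-validated as a single estimator.
grid = GridSearchCV(pipe, param_grid={"clf__C": [0.1, 1.0, 10.0]}, cv=5)
grid.fit(X, y)

# A step's estimator can be swapped out entirely, and a transformer can be
# disabled with 'passthrough':
pipe.set_params(clf=SVC())
pipe.set_params(scaler="passthrough")
```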

Read more in the User Guide.

New in version 0.5.

Parameters

steps list of tuples

List of (name, transform) tuples (implementing fit/transform) that are chained, in the order in which they are chained, with the last object an estimator.

memory str or object with the joblib.Memory interface, default=None

Used to cache the fitted transformers of the pipeline. By default, no caching is performed. If a string is given, it is the path to the caching directory. Enabling caching triggers a clone of the transformers before fitting. Therefore, the transformer instance given to the pipeline cannot be inspected directly. Use the attribute named_steps or steps to inspect estimators within the pipeline. Caching the transformers is advantageous when fitting is time consuming.
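A sketch of caching, assuming a throwaway cache directory and illustrative data (verbose=True is shown alongside it):

```python
import tempfile

import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Illustrative cache location; any directory path (or a joblib.Memory object)
# can be passed as memory.
cache_dir = tempfile.mkdtemp()

pipe = Pipeline(
    [("reduce", PCA(n_components=2)), ("clf", LogisticRegression())],
    memory=cache_dir,  # fitted transformers are cached in this directory
    verbose=True,      # also print the elapsed time as each step completes
)

X = np.random.RandomState(0).rand(20, 5)
y = np.arange(20) % 2  # two balanced classes, purely for illustration
pipe.fit(X, y)

# Caching clones the transformers before fitting, so inspect the fitted
# transformer through named_steps (or steps), not the original PCA instance:
print(pipe.named_steps["reduce"].explained_variance_ratio_)
```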

verbose bool, default=False

If True, the time elapsed while fitting each step will be printed as it is completed.

Attributes

named_steps Bunch

Dictionary-like object with attribute access. Read-only attribute to access any step by its user-given name. Keys are the step names and values are the corresponding step estimators.
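For example (step names and data are illustrative):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([("scaler", StandardScaler()),
                 ("clf", LogisticRegression())])
pipe.fit(np.random.RandomState(0).rand(10, 3), np.arange(10) % 2)

# Steps are reachable both as keys and as attributes, by user-given name:
print(pipe.named_steps["scaler"].mean_)  # fitted scaler statistics
print(pipe.named_steps.clf.coef_.shape)  # same estimator, attribute access
```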

See also

make_pipeline: Convenience function for simplified pipeline construction.
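A short sketch of the difference: make_pipeline builds the same Pipeline but generates the step names automatically from the lowercased class names.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# No explicit (name, transform) tuples; names are derived from the classes.
pipe = make_pipeline(StandardScaler(), LogisticRegression())
print([name for name, _ in pipe.steps])  # ['standardscaler', 'logisticregression']
```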
