
# 2. Data Split

<figure><img src="/files/f4g8kOsvNFA4Xx4LxaXh" alt="" width="344"><figcaption></figcaption></figure>

1. Click on ***Data Split*** in the ***Machine Learning*** category.

<figure><img src="/files/Cle9tmy19JsruXRouou8" alt="" width="563"><figcaption></figcaption></figure>

2. ***Input Data***: Choose whether the input data includes target data. If it does, select ***Feature Data*** and ***Target Data*** separately. You can also select specific columns from a single dataset using the funnel icon.
3. ***Test Size***: Select the percentage of the input data to hold out as the test set; the remainder is used for training.
4. ***Random State***: Set a seed for the random number generator so that the same split is produced on every run. (If not set, the data is split differently each time.)
5. ***Shuffle***: Shuffle the data randomly before splitting so that the split does not depend on the original row order, which reduces bias and helps the model generalize.
6. ***Stratify***: Preserve the class ratios in both the training and test sets so that no class is over- or under-represented (for classification tasks).
7. ***Allocate to***: Assign variable names to the split data.
8. ***Code View***: Preview the code that will be output.
9. ***Run***: Execute the code.
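
As a rough illustration, the options above map naturally onto the parameters of scikit-learn's `train_test_split`. The sketch below assumes that mapping; the toy arrays `X` and `y` and the variable names `X_train`, `X_test`, `y_train`, and `y_test` are hypothetical stand-ins for your own data and for whatever you enter under ***Allocate to***. The code shown in ***Code View*** is the authoritative output and may differ.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 20 samples, 2 features, balanced binary target.
X = np.arange(40).reshape(20, 2)   # stands in for Feature Data
y = np.array([0, 1] * 10)          # stands in for Target Data

# Allocate to: the four names on the left receive the split data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.25,    # Test Size: hold out 25% of the rows for testing
    random_state=42,   # Random State: fixed seed, so the split is reproducible
    shuffle=True,      # Shuffle: shuffle rows before splitting
    stratify=y,        # Stratify: keep class ratios in both subsets
)

print(X_train.shape, X_test.shape)
```

Note that in scikit-learn, stratifying requires shuffling to be enabled; with `shuffle=False`, `stratify` must be left as `None`.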
