
Comet for structured data problems

Many machine learning problems involve structured data. Comet integrates with a variety of frameworks designed to solve this type of problem. For a full list of these integrations, check our Integrations page.


The following end-to-end example looks at the binary classification problem of Churn Prediction.

Churn Prediction is a common use case in machine learning and involves predicting whether or not a customer will stop using your product. In building such a model, you typically:

  1. Explore the data.
  2. Train and evaluate a baseline model.
  3. Visualize the model predictions and metrics.

We will showcase some of the Comet features that can help you solve structured data problems. These include:

  1. Artifacts, for versioning the input data.
  2. The Sweetviz integration, for profiling the dataset.
  3. Automatic parameter logging for scikit-learn models.
  4. Metric logging with prefixes and Experiment Contexts.
  5. The interactive Confusion Matrix.

We will use the IBM Telco Customer Churn Dataset from Kaggle to build and evaluate our model.

Create an Experiment

The first step in tracking our run is to create an Experiment:

import comet_ml
experiment = comet_ml.Experiment(
    api_key="<Your API Key>",
    project_name="<Your Project Name>"
)

Note

There are alternatives to setting the API key programmatically. See more here.
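For example, credentials can be supplied through Comet's standard environment variables so that no secrets appear in your script. A minimal sketch, setting them from Python (in practice you would usually export them in your shell or a Comet config file instead):

import os

# Comet picks up these environment variables automatically,
# so Experiment() can be created without explicit arguments.
os.environ["COMET_API_KEY"] = "<Your API Key>"
os.environ["COMET_PROJECT_NAME"] = "<Your Project Name>"

experiment = comet_ml.Experiment()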

Track and explore the data

Next, we're going to download the data for this example using Comet Artifacts.

This snippet downloads the Churn dataset to our current working directory.

artifact = experiment.get_artifact('team-comet-ml/telco-churn-dataset:latest')
artifact.download('./')

Using Artifacts allows us to track the exact version of input data used in this experiment. Artifacts can be consumed or produced by an experiment, and can be viewed in the Assets and Artifacts tab in the single Experiment view.
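Producing a new artifact version from an experiment only takes a few lines. A minimal sketch, assuming an illustrative artifact name and a local copy of the dataset file:

# Create an artifact version and attach the local dataset file to it.
artifact = comet_ml.Artifact(name="my-churn-dataset", artifact_type="dataset")
artifact.add("./telco-churn-dataset.csv")

# Upload the artifact and associate it with this experiment.
experiment.log_artifact(artifact)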

Assets and Artifacts in an Experiment

Now that we've fetched our data, we will use Comet's integration with Sweetviz to profile the dataset and log the resulting report to our experiment.

import pandas as pd
import sweetviz

df = pd.read_csv('./telco-churn-dataset.csv', index_col=0)
report = sweetviz.analyze(df, target_feat='Churn Label')
report.log_comet(experiment)

The interactive report is logged under the HTML tab in the Experiment view.

Sweetviz Report

Train and evaluate a baseline model

We will use scikit-learn's RandomForestClassifier as our baseline model. Since we are using scikit-learn as our framework, Comet automatically logs the model parameters without requiring additional instrumentation code.

Note

Check out our Integrations section for more details about using Comet's automatic logging capabilities with your preferred machine learning framework.
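The snippets below assume that train and test splits (X_train, X_test, y_train, y_test) have already been prepared from the dataframe. A minimal sketch of one way to produce them, assuming the column names from the Kaggle dataset and an illustrative one-hot encoding of the categorical features:

from sklearn.model_selection import train_test_split

# Binary target: 1 if the customer churned, 0 otherwise.
y = (df["Churn Label"] == "Yes").astype(int)

# Illustrative preprocessing: one-hot encode the remaining columns.
X = pd.get_dummies(df.drop(columns=["Churn Label"]))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)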

from sklearn.ensemble import RandomForestClassifier

clf = RandomForestClassifier()
clf.fit(X_train, y_train)

The logged parameters can be found under the Hyperparameters tab in the Experiment view.

Model Parameters
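Automatic logging covers the estimator's own parameters, but you can also record additional values explicitly with experiment.log_parameters. A minimal sketch, with illustrative keys:

# Log a few run-level settings alongside the auto-logged model parameters.
experiment.log_parameters({
    "model_type": "RandomForestClassifier",
    "dataset_version": "telco-churn-dataset:latest",
})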

Log metrics from a classification report

Now that we've trained our baseline model, let's compute some metrics to assess model performance. We're going to create a classification report using scikit-learn and log the resulting metrics (precision, recall, and F1) to Comet.

The report is a dictionary with the following structure:

{
    "0": {
        "precision": 0.9997591522157996,
        "recall": 1.0,
        "f1-score": 0.9998795616042394,
        "support": 4151,
    },
    "1": {
        "precision": 1.0,
        "recall": 0.9993215739484396,
        "f1-score": 0.999660671869698,
        "support": 1474,
    },
    "accuracy": 0.9998222222222222,
    "macro avg": {
        "precision": 0.9998795761078998,
        "recall": 0.9996607869742198,
        "f1-score": 0.9997701167369687,
        "support": 5625,
    },
    "weighted avg": {
        "precision": 0.999822265039606,
        "recall": 0.9998222222222222,
        "f1-score": 0.9998222027653569,
        "support": 5625,
    },
}

It is a nested dictionary: each top-level key indicates how the metric was calculated ("macro avg", "weighted avg") or which class it applies to ("0", "1"), and each value is a dictionary mapping metric names to their values. The only exception is accuracy, which is a simple key-value pair.

We would like to compute these metrics for each dataset split (train, test) and log the values in this dictionary in a way that preserves all the information provided by the keys.

We can use the prefix option in the experiment.log_metrics method to appropriately add the dictionary keys to our model metric names. Also, to keep the evaluation code concise, we will use Comet's Experiment Context to append the appropriate context to our model metric names.

from sklearn.metrics import classification_report


def log_classification_report(y_true, y_pred):
    report = classification_report(y_true, y_pred, output_dict=True)
    for key, value in report.items():
        if key == "accuracy":
            # "accuracy" is a plain float, so log it as a single metric.
            experiment.log_metric(key, value)
        else:
            # Every other entry is a dict of metrics; use its key
            # ("0", "1", "macro avg", ...) as a prefix on the metric names.
            experiment.log_metrics(value, prefix=key)


with experiment.train():
    log_classification_report(y_train, clf.predict(X_train))

with experiment.test():
    log_classification_report(y_test, clf.predict(X_test))

The snippet logs the metric data to Comet in the following way:

Model Metrics

Notice how we have preserved all the contextual information around the metric.
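For illustration, the combination of context and prefix works roughly like the explicit call below; inside the train context, a metric logged with prefix "0" is recorded with both pieces of context in its name (for example, train_0_precision; the exact display can vary by Comet version):

# Illustrative only: the training context plus the "0" prefix
# yields a metric name like "train_0_precision".
with experiment.train():
    experiment.log_metrics({"precision": 0.99}, prefix="0")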

Log a confusion matrix

Now that we have logged a few metrics, let's see where our classifier is having difficulties by logging a Confusion Matrix and inspecting the misclassified examples.

Comet's Confusion Matrix lets you log samples of data along with the model predictions, so that you can identify the specific features your model is having trouble with. Since we have a large number of features in our data, we will only log a subset of them for this example.

def index_to_example(index):
    # Map a confusion matrix cell's sample index back to a small,
    # readable subset of its features.
    return X_test.iloc[index][["CLTV", "Monthly Charges", "Total Charges"]].to_json()


experiment.log_confusion_matrix(
    y_test.tolist(),
    clf.predict(X_test).tolist(),
    index_to_example_function=index_to_example,
)

The logged matrix can be found under the Confusion Matrix tab in the Experiment view:

Confusion Matrix
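If you prefer human-readable class names in the matrix, log_confusion_matrix also accepts a labels argument. A minimal sketch, with illustrative display names:

experiment.log_confusion_matrix(
    y_test.tolist(),
    clf.predict(X_test).tolist(),
    labels=["No Churn", "Churn"],
    index_to_example_function=index_to_example,
)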

Try it out!

We have prepared a Colab Notebook that you can use to run the example yourself.

Open In Colab

More examples

Other typical end-to-end examples showcase how Comet is used to handle the challenges presented by natural language processing (NLP) and image data.
