Experiment tracking for Data Scientists

Organize, automate and standardize experiment tracking in a flexible tool that your growing team will use

Version data and experiments for easier reproducibility.
Search, visualize, debug, and compare experiments and datasets.
Share and collaborate on experiment results across the organization.

No credit card required
import neptune

run = neptune.init_run()

run["parameters"] = params
run["dataset/train_version"].track_files("s3://datasets")

def any_module_function_or_hook(run):
    run["train/accuracy"].append(acc)
    run["valid/misclassified_images"].append(img)
    run["your/metadata/structure"].append(any_metadata)
Log ML metadata

Log any model metadata from anywhere in your ML pipeline. Get started in 5 minutes.

Add a snippet to any step of your ML pipeline once. Decide what and how you want to log. Run a million times.

  • Any framework
  • Any metadata type
  • From anywhere in your ML pipeline
Any framework
import neptune

# Connect to Neptune and create a run
run = neptune.init_run()

# Log hyperparameters
run["parameters"] = {
    "batch_size": 64,
    "dropout":0.5,
    "optimizer": {"type":"SGD", "learning_rate": 0.001},
}
# Log dataset versions
run["data/train_version"].track_files("train/images")

# Log the training process
for epoch in range(100):
    run["train/accuracy"].append(accuracy)

# Log test metrics and charts
run["test/f1_score"] = test_score
run["test/confusion_matrix"].upload(fig)

# Log model weights and versions
run["model/weights"].upload("my_model.pkl")

# Stop logging to your run
run.stop()
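
# PyTorch Lightning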
from lightning.pytorch import Trainer
from lightning.pytorch.loggers import NeptuneLogger

neptune_logger = NeptuneLogger()

trainer = Trainer(max_epochs=10, logger=neptune_logger)

trainer.fit(my_model, my_dataloader)
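
# TensorFlow / Keras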
from neptune.integrations.tensorflow_keras import NeptuneCallback

run = neptune.init_run()
neptune_cbk = NeptuneCallback(run=run)

model.fit(..., callbacks=[neptune_cbk])
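
# PyTorch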
from neptune.integrations.pytorch import NeptuneLogger

run = neptune.init_run()
neptune_callback = NeptuneLogger(run=run, model=model)
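
# scikit-learn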
import neptune.integrations.sklearn as npt_utils
from sklearn.ensemble import GradientBoostingClassifier

run = neptune.init_run()

parameters = {
    "n_estimators": 120,
    "learning_rate": 0.12,
    "min_samples_split": 3,
    "min_samples_leaf": 2,
}

gbc = GradientBoostingClassifier(**parameters)
gbc.fit(X_train, y_train)

run["cls_summary"] = npt_utils.create_classifier_summary(
    gbc, X_train, X_test, y_train, y_test
)
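
# LightGBM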
import lightgbm as lgb
from neptune.integrations.lightgbm import NeptuneCallback, create_booster_summary

run = neptune.init_run()
neptune_callback = NeptuneCallback(run=run)

params = {
    "boosting_type": "gbdt",
    "objective": "multiclass",
    "num_class": 10,
    "metric": ["multi_logloss", "multi_error"],
    "num_leaves": 21,
    "learning_rate": 0.05,
    "max_depth": 12,
}

# Train the model
gbm = lgb.train(
    params,
    lgb_train,
    num_boost_round=200,
    valid_sets=[lgb_train, lgb_eval],
    valid_names=["training", "validation"],
    callbacks=[neptune_callback],
)

run["lgbm_summary"] = create_booster_summary(
    booster=gbm,
    log_trees=True,
    list_trees=[0, 1, 2, 3, 4],
    log_confusion_matrix=True,
    y_pred=y_pred,
    y_true=y_test,
)
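
# XGBoost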
import xgboost as xgb
from neptune.integrations.xgboost import NeptuneCallback

run = neptune.init_run()
neptune_callback = NeptuneCallback(run=run, log_tree=[0, 1, 2, 3])

params = {
    "eta": 0.7,
    "gamma": 0.001,
    "max_depth": 9,
    "objective": "reg:squarederror",
    "eval_metric": ["mae", "rmse"],
}

xgb.train(
    params=params,
    dtrain=dtrain,
    num_boost_round=num_round,
    evals=evals,
    callbacks=[neptune_callback],
)
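
# Optuna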
import optuna
import neptune.integrations.optuna as npt_utils

run = neptune.init_run()
neptune_callback = npt_utils.NeptuneCallback(run)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20, callbacks=[neptune_callback])
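
# Apache Airflow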
from airflow import DAG
from neptune_airflow import NeptuneLogger

with DAG(...) as dag:
    def task(**context):
        neptune_logger = NeptuneLogger()
        ...
        # Get the Neptune run from the current task
        # and log metadata manually
        with neptune_logger.get_run_from_context(
            context=context, log_context=log_context
        ) as run:
            ...
            run["checkpoint"].upload_files("my_model.h5")
Any metadata type
run["score"] = 0.97

for epoch in range(100):
    run["train/accuracy"].append(acc)
video placeholder video placeholder
run["model/parameters"] = {
    "lr": 0.2,
    "optimizer": {"name": "Adam", "momentum": 0.9},
}
video placeholder video placeholder
run["train/images"].track_files("./datasets/images")
video placeholder video placeholder
run["matplotlib-fig"].upload(fig)

for name in misclassified_images_names:
    run["misclassified_images"].append(File("misclassified_image.png"))
video placeholder video placeholder
run["visuals/altair-fig"].upload(File.as_html(fig))
video placeholder video placeholder
run["video"].upload("/path/to/video-file.mp4")
video placeholder video placeholder
run = neptune.init_run(capture_hardware_metrics=True)
video placeholder video placeholder
run = neptune.init_run(
    source_files=["**/*.py", "config.yaml"],
    dependencies="infer",
)
video placeholder video placeholder
decor decor
From anywhere in your ML pipeline
Log from many pipeline nodes to the same run
export NEPTUNE_CUSTOM_RUN_ID="SOME ID"
Log from multiple machines to the same run
export NEPTUNE_CUSTOM_RUN_ID="SOME ID"
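For example, a minimal sketch of the idea, assuming each script or machine exports the same NEPTUNE_CUSTOM_RUN_ID (the ID and metric values below are illustrative):

import neptune

# Every process that uses the same custom run ID writes to the same run.
# The ID can come from the NEPTUNE_CUSTOM_RUN_ID environment variable
# or be passed explicitly, as shown here.
run = neptune.init_run(custom_run_id="SOME ID")

# e.g., logged from the training node
run["train/accuracy"].append(0.92)

# e.g., logged later from the evaluation node
run["test/accuracy"] = 0.88

run.stop()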
# Open finished run "SUN-123"
run = neptune.init_run(with_id="SUN-123")

# Download model
run["train/model_weights"].download()

# Continue logging
run["test/accuracy"].append(0.68)
Script
run = neptune.init_run(mode="offline")
Console
neptune sync
Organize experiments

Organize and display experiments and model metadata however you want

Organize logs in a fully customizable nested structure. Display model metadata in user-defined dashboard templates.

  • Nested metadata structure
  • Custom dashboards
  • Table views
Nested metadata structure
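As a rough sketch (the namespace paths and values below are illustrative, not a required schema), metadata is grouped by the slash-separated paths you log to:

import neptune

run = neptune.init_run()

# Folder-like namespaces are created on the fly from the paths you log to
run["data/version/train"] = "v1.2.0"
run["model/params/lr"] = 0.01
run["model/params/optimizer"] = "Adam"

for epoch in range(10):
    run["metrics/train/loss"].append(1.0 / (epoch + 1))
    run["metrics/valid/loss"].append(1.2 / (epoch + 1))

run.stop()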
Custom dashboards
Table views
Compare results

Search, debug, and compare experiments, datasets, and models

Visualize training live in the Neptune web app. See how different parameters and configs affect the results. Optimize models faster.

  • Compare
  • Search, sort, and filter
  • Monitor live
  • Group by
Compare
Search, sort, and filter
Monitor live
Group by
Reproduce experiments

Version datasets and experiments for easier reproducibility

Save dataset versions, environment configs, parameters, code, metrics, and model binaries for every experiment you run.

  • Version datasets
  • Save hyperparameters
  • Track environment and code
  • Save metrics and results
  • Log and version model binaries
Version datasets
# Local file
run["train_dataset"].track_files("./datasets/train.csv")

# Local directory
run["train/images"].track_files("./datasets/images")

# S3 storage
run["train/eval_images"].track_files("s3://datasets/eval_images")
Save hyperparameters
run["parameters/epoch_nr"] = 5
run["parameters/batch_size"] = 32

PARAMS = {
    "optimizer": "sgd",
    "model": {
        "dense": 512,
        "dropout": 0.5,
        "activation": "relu",
    },
}

run["parameters"] = PARAMS
Track environment and code
run = neptune.init_run(
    project="common/showroom",
    dependencies="requirements.txt",
    source_files=[
        "model.py",
        "preparation.py",
        "exploration.ipynb",
        "**/*.sh",
        "config.yaml",
    ],
)
Save metrics and results
# log score
run["score"] = 0.97
run["test/acc"] = 0.97

# log learning curves
for epoch in range(100):
    ...
    run["train/accuracy"].append(acc)
    run["train/loss"].append(loss)
    run["metric"].append(metric)

# log diagnostic charts
run["test/confusion_matrix"].upload(fig_cm)
run["test/precision_recall_curve"].upload(fig_prc)
Log and version model binaries
run["model/binary"].upload("my_model.pkl")
run["model/version"].track_files("my_model.pkl")
Share results

Share and collaborate on experiment results and models across the org

Have a single place where your team can see the results and access all models and experiments.

  • Send a link
  • Query API
  • Manage users and projects
  • Add your entire org
Send a link
Query API
# Fetch metadata from a single run
run = neptune.init_run(
    with_id="DET-135",
    mode="read-only",
)

batch_size = run["parameters/batch_size"].fetch()
losses = run["train/loss"].fetch_values()
md5 = run["dataset/train"].fetch_hash()
run["trained_model"].download("models/")


# Fetch runs from a project in bulk
project = neptune.init_project()

runs_df = project.fetch_runs_table(
    query="f1_score:float > 0.8",
    columns=["sys/running_time", "f1_score"],
    sort_by="f1_score",
    state="inactive",
).to_pandas()
Manage users and projects
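As a rough sketch, project and user administration can also be scripted through the management API (the workspace, project, and user names below are made up, and the exact argument names may differ between client versions, so treat this as an assumption and check the management API docs):

from neptune import management

# Create a new project in your workspace (names are illustrative)
management.create_project(name="my-workspace/churn-prediction", key="CHURN")

# Add a teammate to the project with a given role
management.add_project_member(
    project="my-workspace/churn-prediction",
    username="teammate",
    role="contributor",
)

# List the projects you have access to
print(management.get_project_list())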
Add your entire org
Integrate

Integrate with any MLOps stack

I’ve been mostly using Neptune just looking at the UI which I have, let’s say, kind of tailored to my needs. So I added some custom columns which will enable me to easily see the interesting parameters and based on this I’m just shifting over the runs and trying to capture what exactly interests me.
Wojciech Rosiński, CTO at ReSpo.Vision
Gone are the days of writing stuff down on google docs and trying to remember which run was executed with which parameters and for what reasons. Having everything in Neptune allows us to focus on the results and better algorithms.
Andreas Malekos, Head of Artificial Intelligence at Continuum Industries
Neptune is aesthetic. Therefore we could simply use the visualization it was generating in our reports.

We trained more than 120,000 models in total, for more than 7,000 subproblems identified by various combinations of features. Due to Neptune, we were able to filter experiments for given subproblems and compare them to find the best one. Also, we stored a lot of metadata, visualizations of hyperparameters’ tuning, predictions, pickled models, etc. In short, we were saving everything we needed in Neptune.
Patryk Miziuła, Senior Data Scientist at deepsense.ai
The way we work is that we do not experiment constantly. After checking out both Neptune and Weights and Biases, Neptune made sense to us due to its pay-per-use or usage-based pricing. Now when we are doing active experiments then we can scale up and when we’re busy integrating all our models for a few months that we scale down again.
Viet Yen Nguyen, CTO at Hypefactors

Get started

1

Create a free account

2

Install the Neptune client library

pip install neptune
3

Track experiments

import neptune

run = neptune.init_run()
run["params"] = {"lr": 0.1, "dropout": 0.4}
run["test_accuracy"] = 0.84

Resources

Code examples, videos, projects gallery, and other resources.

Frequently asked questions

Yes, you can deploy Neptune on-premises (and other answers)

  • Read more about our deployment options here.

    But in short, yes, you can deploy Neptune on your on-prem infrastructure or in your private cloud. 

    It is a set of microservices distributed as a Helm chart that you deploy on Kubernetes. 

    If you don’t have your own Kubernetes cluster deployed, our installer will set up a single-node cluster for you. 

    As for infrastructure requirements, you need a machine with at least 8 CPUs, 32 GB RAM, and 1 TB SSD storage.

    Read the on-prem documentation if you’re interested, or talk to us (support@neptune.ai) if you have questions.

    If you have any trouble, our deployment engineers will help you all the way.

  • Yes, you can just reference datasets that sit on your infrastructure or in the cloud. 

    For example, you can have your datasets on S3 and just reference the bucket. 

    run["train_dataset"].track_files("s3://datasets/train.csv")

    Neptune will save the following metadata about this dataset: 

    • version (hash)
    • location (path)
    • size
    • folder structure and contents (files)

    Neptune never uploads the dataset, just logs the metadata about it. 

    You can later compare datasets or group experiments by dataset version in the UI.
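    For example, a minimal sketch of fetching that metadata back later to compare dataset versions (the run ID and field name are illustrative):

    import neptune

    # Reopen an existing run in read-only mode (the ID is an example)
    run = neptune.init_run(with_id="DET-135", mode="read-only")

    # Fetch the version hash of the tracked dataset to compare it across runs
    print(run["train_dataset"].fetch_hash())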

  • Short version. People choose Neptune when:

    • They don’t want to maintain infrastructure (including autoscaling, backups etc.),
    • They keep scaling their projects (and get into thousands of runs),
    • They collaborate with a team (and want user access, multi-tenant UI etc.).

    For the long version, read this full feature-by-feature comparison.

  • Short version. People choose Neptune when:

    • They want to pay a reasonable price for the experiment tracking solution,
    • They want a super flexible tool (customizable logging structure, dashboards, works great with time series ML),
    • They want a component for experiment tracking and model versioning, NOT an end-to-end platform (WandB has HPO, orchestration, model deployment, etc. We integrate with best-in-class tools in the space).

    For the long version, read this full feature-by-feature comparison.

  • It depends on what you mean by “model monitoring”. 

    As we talk to teams, it seems that “model monitoring” means six different things to three different people: 

    • (1) Monitor model performance in production: See if the model's performance decays over time and whether you should re-train it
    • (2) Monitor model input/output distributions: See how the distributions of input data, features, and predictions change over time
    • (3) Monitor model training and re-training: See learning curves, trained model prediction distributions, or confusion matrices during training and re-training
    • (4) Monitor model evaluation and testing: Log metrics, charts, predictions, and other metadata for your automated evaluation or testing pipelines
    • (5) Monitor hardware metrics: See how much CPU/GPU or memory your models use during training and inference
    • (6) Monitor CI/CD pipelines for ML: See the evaluations from your CI/CD pipeline jobs and compare them visually

    So when looking at the tooling landscape and Neptune:

    • Neptune does (3) and (4) really well, but we see teams use it for (5) and (6). We don’t do (1) and (2), but we integrate with tools for that.
    • Prometheus + Grafana is really good at (5), but people use it for (1) and (2).
    • WhyLabs or Arize are really good at (1) and (2).