Create interactive views of any dataset.
pip install meerkat-ml
Next Steps. Check out our Getting Started page and our documentation to start building with Meerkat.
Meerkat is an open-source Python library that helps users visualize, explore, and annotate any dataset. It is especially useful when processing unstructured data types (e.g. free text, PDFs, images, video) with machine learning models.
Here are four principles that inform Meerkat's design.
(1) Low overhead. With four lines of Python, start interacting with any dataset.
import meerkat as mk
df = mk.from_csv("paintings.csv")
df["image"] = mk.files(df["image_url"])
df
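The snippet above assumes a `paintings.csv` file with an `image_url` column. As a rough sketch of that input shape, the standard-library code below writes and reads back such a file. The `title` column, the rows, and the example.com URLs are hypothetical placeholders, not data shipped with Meerkat.

```python
import csv
import os
import tempfile

# Hypothetical rows: column names and URLs are placeholders, not real data.
rows = [
    {"title": "Painting A", "image_url": "https://example.com/a.jpg"},
    {"title": "Painting B", "image_url": "https://example.com/b.jpg"},
]


def write_paintings_csv(path, rows):
    """Write the rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["title", "image_url"])
        writer.writeheader()
        writer.writerows(rows)


def read_paintings_csv(path):
    """Read the CSV back as a list of dicts, one per row."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


# Write to a temporary directory so no existing file is overwritten.
csv_path = os.path.join(tempfile.mkdtemp(), "paintings.csv")
write_paintings_csv(csv_path, rows)
loaded = read_paintings_csv(csv_path)
```

Any CSV with one column of image paths or URLs should serve the same purpose; only the column name passed to Meerkat needs to match.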
(2) Diverse data types. Visualize and annotate almost any data type in Meerkat interfaces: text, images, audio, video, MRI scans, PDFs, HTML, JSON.
(3) "Intelligent" user interfaces. Meerkat makes it easy to embed machine learning models (e.g. LLMs) within user interfaces to enable intelligent functionality such as searching, grouping and autocomplete.
df["embedding"] = mk.embed(df["image"], engine="clip")
match = mk.gui.Match(df,
    against="embedding",
    engine="clip"
)
sorted_df = mk.sort(df,
    by=match.criterion.name,
    ascending=False
)
gallery = mk.gui.Gallery(sorted_df)
mk.gui.html.div([match, gallery])
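To give a feel for what embedding-based search does conceptually, here is a minimal sketch in plain Python that ranks rows by cosine similarity to a query vector. It is an illustration only, not Meerkat's implementation: the function names are hypothetical, and the toy 2-D vectors stand in for real CLIP embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, embeddings):
    """Return row indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query, e) for e in embeddings]
    return sorted(range(len(embeddings)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for per-row embeddings.
embeddings = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
query = [1.0, 0.1]
order = rank_by_similarity(query, embeddings)
```

Sorting in descending score order mirrors the `ascending=False` argument in the snippet above: the best matches come first in the gallery.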
(4) Declarative (think: Seaborn), but also infinitely customizable and composable. Meerkat visualization components can be composed and customized to create new interfaces.
plot = mk.gui.plotly.Scatter(df=plot_df, x="umap_1", y="umap_2")
@mk.gui.reactive
def filter(selected: list, df: mk.DataFrame):
    return df[df.primary_key.isin(selected)]
filtered_df = filter(plot.selected, plot_df)
table = mk.gui.Table(filtered_df, classes="h-full")
mk.gui.html.flex([plot, table], classes="h-[600px]")
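The idea behind the reactive filter can be sketched without Meerkat: when the selection changes, a dependent function reruns and the table contents update. The plain-Python version below is an assumption-laden illustration of that pattern, not how Meerkat implements reactivity. The `Selection` class, `filter_rows`, and the sample rows are all hypothetical.

```python
class Selection:
    """Holds the selected keys and notifies subscribers when they change."""

    def __init__(self):
        self._keys = []
        self._subscribers = []

    def subscribe(self, callback):
        """Register a function to call with the new keys on every change."""
        self._subscribers.append(callback)

    def set(self, keys):
        """Replace the selection and rerun every subscriber."""
        self._keys = list(keys)
        for callback in self._subscribers:
            callback(self._keys)


def filter_rows(rows, selected, key="id"):
    """Keep only rows whose primary key is in the selection."""
    wanted = set(selected)
    return [row for row in rows if row[key] in wanted]


# Hypothetical rows; "id" plays the role of the primary key.
rows = [{"id": 1, "label": "a"}, {"id": 2, "label": "b"}, {"id": 3, "label": "c"}]
table_rows = []


def refresh_table(selected):
    """Recompute the table contents in place for the current selection."""
    table_rows[:] = filter_rows(rows, selected)


selection = Selection()
selection.subscribe(refresh_table)
selection.set([1, 3])  # simulates selecting two points in the scatter plot
```

In the Meerkat snippet above, the decorator takes care of this wiring, so the filter function itself stays an ordinary function over a selection and a DataFrame.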
Meerkat is being built by Machine Learning PhD students in the Hazy Research lab at Stanford. We're excited to build for a future where models make it easier for teams to sift and reason through large volumes of unstructured data.
Please reach out to kgoel [at] cs [dot] stanford [dot] edu, eyuboglu [at] stanford [dot] edu, and arjundd [at] stanford [dot] edu if you would like to use Meerkat for a project or at your company, or if you have any questions.