Everything you need to know about the AI platform

24 March 2024 (20:01)

Hugging Face is a platform for viewing, sharing, and showcasing machine learning models, datasets, and related work. It aims to make Neural Language Models (NLMs) accessible to anyone building applications powered by machine learning. Many popular AI and machine-learning models are accessible through Hugging Face, including LLaMA 2, an open source language model that Meta developed in partnership with Microsoft.

Hugging Face is a valuable resource for beginners learning to work with machine-learning models. You don’t need to pay for any special apps or programs. You only need a web browser to browse and test models and datasets on any device, even on budget Chromebooks.

Contents

1 What is Hugging Face?
2 What can you do with Hugging Face?
- 2.1 The Transformers model library
- 2.2 Using the datasets library
- 2.3 Using pipelines to perform tasks
3 How to get started with Hugging Face
- 3.1 Collaborating with Hugging Face

What is Hugging Face?

Hugging Face provides machine-learning tools for building applications. Notable tools include the Transformers model library, pipelines for performing machine-learning tasks, and collaborative resources. It also offers dataset, model evaluation, simulation, and machine learning libraries. Hugging Face can be summarized as providing these services:

Hosting open source pre-trained machine learning models.
Easy access to these models through various environments (for example, Google Colab or a Python virtual environment).
Tools for adjusting machine-learning models.
An API that offers a user-friendly interface for performing various machine-learning tasks.
Community spaces for collaborating, sharing, and showcasing work.

Hugging Face receives funding from companies including Google, Amazon, Nvidia, Intel, and IBM. Several large technology companies have also published open source models on Hugging Face, such as Meta’s LLaMA 2 model mentioned at the beginning of this article.

The number of models available through Hugging Face can be overwhelming, but it’s easy to get started. We walk you through everything you need to know about what you can do with Hugging Face and how to create your own tools and applications.

What can you do with Hugging Face?

The core of Hugging Face is the Transformers model library, dataset library, and pipelines. Understanding these services and technologies gives you everything you need to use Hugging Face’s resources.

The Transformers model library

The Transformers model library is a library of open source transformer models. Hugging Face has a library of over 495,000 models grouped into data types called modalities. You can use these models to perform tasks with pipelines, which we explain later in this article.

Some of the tasks you can perform through the Transformers model library are:

Object Detection
Question Answering
Summarization
Text Generation
Translation
Text-to-speech

A complete list of these tasks can be seen on the Hugging Face website, categorized for easy searching.

Within these categories are numerous user-created models to choose from. For example, Hugging Face currently hosts over 51,000 models for Text Generation.

If you aren’t sure how to get started with a task, Hugging Face provides in-depth documentation on every task. These docs include use cases, explanations of model and task variants, relevant tools, courses, and demos. For example, the demo on the Text Generation task page uses the Zephyr language models to complete text. Refer to the model’s page for instructions on how to use it for the task.
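To try the Text Generation task outside the website demo, the following is a minimal sketch in Python. It assumes the transformers library and a backend such as PyTorch are installed (Colab ships with PyTorch). The small gpt2 model is our own choice for a quick test, not the Zephyr model used in the demo, and the helper names build_generator, first_completion, and main are hypothetical.

```python
def build_generator(model_name: str = "gpt2"):
    """Create a text-generation pipeline.

    The model name is an assumption chosen because it is small;
    the model is downloaded the first time this runs.
    """
    # Imported lazily so the helpers below can be used without transformers.
    from transformers import pipeline

    return pipeline("text-generation", model=model_name)


def first_completion(outputs):
    """Return the text of the first candidate.

    Text-generation pipelines return a list of dicts,
    each with a "generated_text" key.
    """
    return outputs[0]["generated_text"]


def main():
    generator = build_generator()
    outputs = generator("Hugging Face is a platform for", max_new_tokens=20)
    print(first_completion(outputs))
```

Calling main() in a Colab cell downloads the model and prints the prompt followed by its completion. The generated text varies between runs.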

These tools make experimenting with models easy. While some are pre-trained with data, you’ll need datasets for others, which is where the datasets library comes into play.

Using the datasets library

The Hugging Face datasets library covers all the machine-learning tasks offered within the Hugging Face model library. Each dataset page includes a dataset viewer, a summary of what’s included in the dataset, the data size, suggested tasks, data structure, data fields, and other relevant information.

For example, the Wikipedia dataset contains cleaned Wikipedia articles in all languages. It has all the necessary documentation for understanding and using the dataset, including helpful tools like a data visualization map of the sample data. Depending on which dataset you access, you may see different examples.
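As a rough illustration of reading a dataset from code, here is a sketch that previews a few Wikipedia records. It assumes the separate datasets package is installed (pip install datasets). The dataset ID wikimedia/wikipedia, the configuration name 20231101.simple, and the title and url fields are assumptions taken from that dataset's page at the time of writing and may change. preview_records and main are hypothetical helper names.

```python
from itertools import islice


def preview_records(records, fields, limit=3):
    """Return the first `limit` records, keeping only the requested fields."""
    return [{f: rec[f] for f in fields} for rec in islice(records, limit)]


def main():
    # Imported lazily; requires `pip install datasets`.
    from datasets import load_dataset

    # streaming=True reads records on demand instead of downloading
    # the whole dataset first. Dataset ID and config are assumptions.
    wiki = load_dataset(
        "wikimedia/wikipedia",
        "20231101.simple",
        split="train",
        streaming=True,
    )
    for row in preview_records(wiki, ["title", "url"]):
        print(row)
```

Streaming keeps the preview fast on a free Colab session, because only the first few records are fetched.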

Using pipelines to perform tasks

Models and datasets are the power behind performing tasks from Hugging Face, but pipelines make it easy to use these models to complete tasks.

Hugging Face’s pipelines simplify using models through an API that hides most of the underlying code. You can choose which model a pipeline uses for a task, and you can combine pipelines. For example, you can use one model for generating results from an input and another for analyzing them. Refer to the model page of each model you used to interpret the formatted results correctly.

Hugging Face has a full breakdown of the tasks you can use pipelines for.
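The idea of generating results with one model and analyzing them with another can be sketched as follows. This assumes transformers and PyTorch are installed. The gpt2 model and the sentiment-analysis task's default model are our own choices for illustration, and label_texts and main are hypothetical helper names.

```python
def label_texts(texts, classifier):
    """Pair each text with the label and score a classifier returns.

    Text-classification pipelines return one dict per input,
    with "label" and "score" keys.
    """
    results = classifier(texts)
    return [
        (text, result["label"], round(result["score"], 3))
        for text, result in zip(texts, results)
    ]


def main():
    # Imported lazily; both pipelines download their models on first use.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")
    classifier = pipeline("sentiment-analysis")  # task's default model

    # Step 1: one model generates two candidate completions.
    outputs = generator(
        "The new laptop is",
        max_new_tokens=15,
        num_return_sequences=2,
        do_sample=True,
    )
    texts = [output["generated_text"] for output in outputs]

    # Step 2: a second model analyzes the generated text.
    for row in label_texts(texts, classifier):
        print(row)
```

The label names (for example POSITIVE or NEGATIVE) depend on the classifier model, which is why the model page matters when reading the output.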

How to get started with Hugging Face

Now that you have an understanding of the models, datasets, and pipelines provided by Hugging Face, you’re ready to use these assets to perform tasks.

You only need a browser to get started. We recommend using Google Colab, which lets you write and execute Python code in your browser. It provides free access to computing resources, including GPUs and TPUs, making it ideal for basic machine-learning tasks. Google Colab is easy to use and requires zero setup.

After you’ve familiarized yourself with Colab, you’re ready to install the Transformers library using the following command:

!pip install transformers

Then check that it was installed correctly by importing it. If the import runs without an error, the library is ready to use:

import transformers
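If you also want to see which version was installed, one option is the sketch below. It uses only the Python standard library, so it reports a missing package instead of raising an error. installed_version is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if the install worked, or None if it did not.
print("transformers version:", installed_version("transformers"))
```

A None result usually means the pip command ran in a different environment from the one executing your code.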

You’re now ready to dive into Hugging Face’s libraries. There are a lot of places to start, but we recommend Hugging Face’s introductory course, which explains the concepts we outlined earlier in detail with examples and quizzes to test your knowledge.

Collaborating with Hugging Face

Collaboration is a huge part of Hugging Face, allowing you to discuss models and datasets with other users. Hugging Face encourages collaboration through a discussion forum, a community blog, Discord, and classrooms.

Models and datasets on Hugging Face also have their own forums where you can discuss errors, ask questions, or suggest use cases.

Machine learning and AI are daunting for beginners, but platforms like Hugging Face provide a great way to introduce these concepts. Many of the popular models on Hugging Face are large language models (LLMs), so familiarize yourself with LLMs if you plan to use machine-learning tools for text generation or analysis.
