Transformers Library

Properties

Identifier: transformers-library
Name: Transformers Library
Type: Topic
Creation timestamp: Fri, 08 Mar 2024 07:38:10 GMT
Modification timestamp: Thu, 29 Aug 2024 08:57:08 GMT

Hugging Face is best known for its Transformers library, an open-source library that provides a collection of pre-trained models for natural language processing and beyond. The library allows easy access to these pre-trained models and provides tools for fine-tuning them on specific tasks. The models support common tasks in different modalities, such as:

  • Natural Language Processing: text classification, named entity recognition, question answering, language modeling, summarization, translation, multiple choice, and text generation.
  • Computer Vision: image classification, object detection, and segmentation.
  • Audio: automatic speech recognition and audio classification.
  • Multimodal: table question answering, optical character recognition, information extraction from scanned documents, video classification, and visual question answering.

Pipelines

The pipeline() makes it simple to use any model from the Hub for inference on language, computer vision, speech, and multimodal tasks. Even if you don’t have experience with a specific modality or aren’t familiar with the code underlying the models, you can still use them for inference with the pipeline().

Pipelines are an easy way to use models for inference. They are objects that abstract most of the complex code in the library behind a simple API dedicated to specific tasks, including Named Entity Recognition, Masked Language Modeling, Sentiment Analysis, Feature Extraction, and Question Answering.
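
As a minimal sketch, assuming the transformers package is installed and the default checkpoint that pipeline() picks for the task can be downloaded from the Hub:

    from transformers import pipeline

    # Load the default sentiment-analysis model that pipeline() selects for this task
    classifier = pipeline("sentiment-analysis")

    print(classifier("Transformers makes NLP accessible."))
    # e.g. [{'label': 'POSITIVE', 'score': 0.99...}]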

There are two categories of pipeline abstractions to be aware of (both are contrasted in the sketch after this list):

  1. The pipeline(), the most powerful object, which encapsulates all other pipelines.
  2. Task-specific pipelines, available for audio, computer vision, natural language processing, and multimodal tasks.
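
A sketch contrasting the two categories; the checkpoint name below is only an example, and in the generic case pipeline() downloads a default model for the task:

    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        TextClassificationPipeline,
        pipeline,
    )

    # 1. The generic pipeline() factory: given a task name, it wires up a default
    #    model, tokenizer, and pre/post-processing for you.
    generic = pipeline("text-classification")

    # 2. A task-specific pipeline class, constructed explicitly from a model and
    #    tokenizer (the checkpoint is just an example).
    name = "distilbert-base-uncased-finetuned-sst-2-english"
    specific = TextClassificationPipeline(
        model=AutoModelForSequenceClassification.from_pretrained(name),
        tokenizer=AutoTokenizer.from_pretrained(name),
    )

    print(generic("The pipeline API hides most of the complexity."))
    print(specific("The pipeline API hides most of the complexity."))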

Before a model sees any text, the input passes through a tokenization pipeline, which works in the following steps (a short sketch follows the list):

  1. Normalization: in a nutshell, a set of operations you apply to a raw string to make it less random or “cleaner”, such as stripping whitespace, removing accents, or lowercasing.
  2. Pre-tokenization: the act of splitting a text into smaller objects that give an upper bound to what your tokens will be at the end of training.
  3. Model: once the input text is normalized and pre-tokenized, the tokenizer applies its tokenization model (such as BPE or WordPiece) to the pre-tokens. The model splits your “words” into tokens using the rules it has learned, and maps those tokens to their corresponding IDs in the model’s vocabulary.
  4. Post-processing: the last step of the tokenization pipeline, performing any additional transformations on the Encoding before it is returned, such as adding special tokens.
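
These steps can be observed on a pretrained tokenizer; a minimal sketch, assuming the bert-base-uncased checkpoint can be downloaded from the Hub:

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    text = "Héllo, Transformers!"

    # Normalization + pre-tokenization + model: split the raw string into subword tokens
    tokens = tokenizer.tokenize(text)
    print(tokens)                              # e.g. ['hello', ',', 'transformers', '!']

    # Model: map each token to its ID in the vocabulary
    print(tokenizer.convert_tokens_to_ids(tokens))

    # Post-processing: encode() additionally adds special tokens such as [CLS] and [SEP]
    print(tokenizer.encode(text))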

Tokenizers

A tokenizer is in charge of preparing the inputs for a model. The library contains tokenizers for all the models. In practical terms, a tokenizer transforms text into a numerical representation that the relevant model understands.
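
A minimal sketch, with the checkpoint name chosen purely as an example: calling the tokenizer directly returns the tensors the corresponding model expects.

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    encoded = tokenizer("Using a Transformers model is simple", return_tensors="pt")

    print(encoded["input_ids"])       # token IDs, as a PyTorch tensor
    print(encoded["attention_mask"])  # marks which positions hold real tokens vs. padding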

Models

A machine learning model (or deep learning model) is a program that can find patterns or make decisions based on data it has not seen before. In natural language processing, for example, such models can parse previously unseen sentences or combinations of words and correctly recognize the intent behind them.
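
A hedged sketch tying the pieces together, assuming PyTorch is installed; the checkpoint is one example of a sentiment model fine-tuned for sequence classification:

    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    name = "distilbert-base-uncased-finetuned-sst-2-english"
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(name)

    # The tokenizer prepares the numerical inputs the model understands
    inputs = tokenizer("The library is easy to fine-tune.", return_tensors="pt")

    with torch.no_grad():                      # inference only, no gradients needed
        logits = model(**inputs).logits

    probabilities = logits.softmax(dim=-1)
    predicted = model.config.id2label[probabilities.argmax(dim=-1).item()]
    print(predicted, probabilities)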

