PyTorch Glossary

Showing 12 of 51 functions

torch.tensor()

TENSOR FUNDAMENTALS

Creates a tensor from data like lists, arrays, or scalars. It's the most common way to create tensors in PyTorch and the foundation for all neural network operations.

Syntax:

torch.tensor(data, dtype=None, device=None, requires_grad=False)

Use Cases:

Converting Python lists/arrays to PyTorch tensors for model input
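A minimal sketch of the list-to-tensor conversion described above:

```python
import torch

# Convert a nested Python list into a 2x3 float tensor
data = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
t = torch.tensor(data, dtype=torch.float32)

print(t.shape)  # torch.Size([2, 3])
print(t.dtype)  # torch.float32
```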

torch.Tensor

TENSOR FUNDAMENTALS

The main tensor class in PyTorch. Represents a multi-dimensional array that can store data and gradients, supporting both CPU and GPU operations with automatic differentiation.

Syntax:

torch.Tensor(*sizes) or torch.Tensor(data)

Use Cases:

Base class for all tensor operations in neural networks
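A short sketch of the size-based constructor. Note that `torch.Tensor(*sizes)` allocates *uninitialized* memory, so its values are arbitrary until written; for creating tensors from data, `torch.tensor()` is generally preferred.

```python
import torch

# Allocate an uninitialized 2x3 float tensor, then fill it explicitly
t = torch.Tensor(2, 3)
t.fill_(0.5)

print(t.shape)                       # torch.Size([2, 3])
print(isinstance(t, torch.Tensor))   # True
```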

torch.zeros()

TENSOR FUNDAMENTALS

Creates a tensor filled with zeros. Essential for initializing tensors with known values, commonly used for padding, masking, and bias initialization in neural networks.

Syntax:

torch.zeros(*size, dtype=None, device=None, requires_grad=False)

Use Cases:

Initializing bias vectors and accumulator tensors
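For example, a zero bias vector and an all-zero padding block:

```python
import torch

# A zero-initialized bias vector and an integer padding block
bias = torch.zeros(8)
pad = torch.zeros(4, 8, dtype=torch.long)

print(bias.sum().item())  # 0.0
print(pad.shape)          # torch.Size([4, 8])
```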

torch.ones()

TENSOR FUNDAMENTALS

Creates a tensor filled with ones. Useful for initialization, creating masks, and mathematical operations where you need a tensor of all ones.

Syntax:

torch.ones(*size, dtype=None, device=None, requires_grad=False)

Use Cases:

Creating bias initialization in neural networks
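A quick sketch of an all-ones mask for a hypothetical `(batch, seq_len)` input:

```python
import torch

# An all-ones attention mask for a batch of 2 sequences of length 5
mask = torch.ones(2, 5)

print(mask.shape)         # torch.Size([2, 5])
print(mask.sum().item())  # 10.0
```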

torch.randn()

TENSOR FUNDAMENTALS

Creates a tensor with random values drawn from a standard normal distribution (mean 0, std 1). Essential for weight initialization and adding noise in neural networks.

Syntax:

torch.randn(*size, dtype=None, device=None, requires_grad=False)

Use Cases:

Initializing neural network weights (Xavier/He initialization)
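A sketch of He-style initialization built on `torch.randn()`: standard-normal draws scaled by `sqrt(2 / fan_in)`. The layer sizes here are illustrative.

```python
import torch

torch.manual_seed(0)  # make the random draw reproducible

# He initialization: scale N(0, 1) samples by sqrt(2 / fan_in)
fan_in = 256
w = torch.randn(128, fan_in) * (2.0 / fan_in) ** 0.5

print(w.shape)  # torch.Size([128, 256])
```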

torch.arange()

TENSOR FUNDAMENTALS

Creates a 1D tensor with evenly spaced values within a given range. Similar to Python's range() but returns a tensor, useful for indexing and creating sequences.

Syntax:

torch.arange(start=0, end, step=1, dtype=None, device=None)

Use Cases:

Creating position indices for positional encoding in transformers
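For instance, generating position indices and a fractional-step sequence:

```python
import torch

# Position indices 0..9, e.g. for positional encodings
positions = torch.arange(10)
print(positions.tolist())  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Non-integer steps are supported too
steps = torch.arange(0, 1, 0.25)
print(steps.tolist())  # [0.0, 0.25, 0.5, 0.75]
```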

torch.cat()

TENSOR FUNDAMENTALS

Concatenates tensors along a specified dimension. Essential for combining embeddings, stacking sequences, and building complex tensor structures in neural networks.

Syntax:

torch.cat(tensors, dim=0)

Use Cases:

Combining outputs from multiple attention heads
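A minimal sketch, using two toy "head outputs" in place of real attention heads:

```python
import torch

# Two head outputs of shape (batch=2, d=4); concatenating along the
# feature dimension (dim=1) yields shape (2, 8)
h1 = torch.zeros(2, 4)
h2 = torch.ones(2, 4)
combined = torch.cat([h1, h2], dim=1)

print(combined.shape)  # torch.Size([2, 8])
```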

torch.stack()

TENSOR FUNDAMENTALS

Stacks tensors along a new dimension. Unlike torch.cat(), it creates a new dimension and requires all tensors to have identical shapes. Essential for batching and creating higher-dimensional tensors.

Syntax:

torch.stack(tensors, dim=0)

Use Cases:

Creating batches from individual samples
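A short sketch of batching with `torch.stack()`, contrasted with `torch.cat()`:

```python
import torch

# Three individual samples of shape (4,); stacking inserts a new
# batch dimension at dim=0, giving shape (3, 4)
samples = [torch.ones(4), torch.zeros(4), torch.ones(4)]
batch = torch.stack(samples, dim=0)
print(batch.shape)  # torch.Size([3, 4])

# torch.cat() would instead join along the existing dimension: (12,)
print(torch.cat(samples).shape)  # torch.Size([12])
```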

torch.squeeze()

TENSOR FUNDAMENTALS

Removes dimensions of size 1 from a tensor. Essential for reshaping tensors and removing unnecessary singleton dimensions that can cause broadcasting issues.

Syntax:

torch.squeeze(input, dim=None)

Use Cases:

Removing batch dimensions of size 1 from model outputs
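For example, removing a singleton batch dimension from a toy model output:

```python
import torch

# A model output with a singleton batch dimension: (1, 10)
out = torch.zeros(1, 10)
print(torch.squeeze(out, dim=0).shape)  # torch.Size([10])

# With dim=None, ALL size-1 dimensions are removed
x = torch.zeros(1, 3, 1)
print(torch.squeeze(x).shape)  # torch.Size([3])
```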

torch.unsqueeze()

TENSOR FUNDAMENTALS

Adds a dimension of size 1 at the specified position. Essential for broadcasting operations and preparing tensors for operations that require specific dimensionality.

Syntax:

torch.unsqueeze(input, dim)

Use Cases:

Adding batch dimensions to single samples
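A minimal sketch of adding a batch dimension to a single sample:

```python
import torch

# A single sample of shape (10,); insert a batch dimension at dim=0
sample = torch.zeros(10)
batched = torch.unsqueeze(sample, 0)

print(batched.shape)  # torch.Size([1, 10])

# The Tensor method form is equivalent
print(sample.unsqueeze(0).shape)  # torch.Size([1, 10])
```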

torch.matmul()

TENSOR OPERATIONS

Performs matrix multiplication between tensors. Supports broadcasting and handles various dimensional cases automatically. The go-to function for most matrix operations in neural networks.

Syntax:

torch.matmul(input, other)

Use Cases:

Computing attention scores in transformer models
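A sketch of the attention-score use case, with toy query/key tensors; the shapes and scaling factor are illustrative:

```python
import torch

# Queries (batch, seq, d) times transposed keys (batch, d, seq)
# gives pairwise scores of shape (batch, seq, seq)
d = 8
q = torch.ones(2, 5, d)
k = torch.ones(2, 5, d)
scores = torch.matmul(q, k.transpose(-2, -1)) / d ** 0.5

print(scores.shape)  # torch.Size([2, 5, 5])
```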

torch.bmm()

TENSOR OPERATIONS

Performs batch matrix multiplication on 3D tensors. Specifically designed for batched operations where you have multiple matrices to multiply in parallel.

Syntax:

torch.bmm(input, mat2)

Use Cases:

Multi-head attention computations in transformers
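A minimal sketch of batched matrix multiplication on 3D tensors. Note both inputs must be 3D with matching batch sizes; unlike `torch.matmul()`, `torch.bmm()` does not broadcast.

```python
import torch

# A batch of 2 matrix pairs: (2, 3, 4) x (2, 4, 5) -> (2, 3, 5)
a = torch.ones(2, 3, 4)
b = torch.ones(2, 4, 5)
c = torch.bmm(a, b)

print(c.shape)            # torch.Size([2, 3, 5])
print(c[0, 0, 0].item())  # 4.0 (sum of four 1*1 products)
```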