Labeled data

This article needs additional citations for verification. Please help improve this article by adding citations to reliable sources. Unsourced material may be challenged and removed.
Find sources: "Labeled data" – news · newspapers · books · scholar · JSTOR (May 2017) (Learn how and when to remove this template message)

Machine learning and data mining

Problems Classification Clustering Regression Anomaly detection AutoML Association rules Reinforcement learning Structured prediction Feature engineering Feature learning Online learning Semi-supervised learning Unsupervised learning Learning to rank Grammar induction
Supervised learning (classification • regression) Decision trees Ensembles Bagging Boosting Random forest k-NN Linear regression Naive Bayes Artificial neural networks Logistic regression Perceptron Relevance vector machine (RVM) Support vector machine (SVM)
Clustering BIRCH CURE Hierarchical k-means Expectation–maximization (EM) DBSCAN OPTICS Mean-shift
Dimensionality reduction Factor analysis CCA ICA LDA NMF PCA t-SNE
Structured prediction Graphical models Bayes net Conditional random field Hidden Markov
Anomaly detection k-NN Local outlier factor
Artificial neural networks Autoencoder Deep learning DeepDream Multilayer perceptron RNN LSTM GRU) Restricted Boltzmann machine SOM Convolutional neural network U-Net
Reinforcement learning Q-learning SARSA Temporal difference (TD)
Theory Bias-variance dilemma Computational learning theory Empirical risk minimization Occam learning PAC learning Statistical learning VC theory
Machine-learning venues NIPS ICML ML JMLR ArXiv:cs.LG
Glossary of artificial intelligence Glossary of artificial intelligence
Related articles List of datasets for machine-learning research Outline of machine learning
Machine learning portal
v t e

Labeled data is a group of samples that have been tagged with one or more labels. Labeling typically takes a set of unlabeled data and augments each piece of that unlabeled data with meaningful tags that are informative. For example, labels might indicate whether a photo contains a horse or a cow, which words were uttered in an audio recording, what type of action is being performed in a video, what the topic of a news article is, what the overall sentiment of a tweet is, whether the dot in an x-ray is a tumor, etc.

Labels can be obtained by asking humans to make judgments about a given piece of unlabeled data (e.g., "Does this photo contain a horse or a cow?"), and are significantly more expensive to obtain than the raw unlabeled data.

After obtaining a labeled dataset, machine learning models can be applied to the data so that new unlabeled data can be presented to the model and a likely label can be guessed or predicted for that piece of unlabeled data.^[1]

References[edit]

^ Johnson, Leif. "What is the difference between labeled and unlabeled data?", Stack Overflow, 4 October 2013. Retrieved on 13 May 2017. This article incorporates text by lmjohns3 available under the CC BY-SA 3.0 license.

References[edit]

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

Interaction

Tools

Print/export

Languages