Signal data
7 important questions on Signal data
How can signals serve as data for classifiers?
- Different activities, or speech, cause different signals.
- Signals from different instances of the same activity ought to have some similarities.
- Features that capture the similarities between signals from the same activity, and the differences between signals from different activities, can be used to train classifier models.
Why can't we use the raw signals as features to train classifiers on?
- Signals are not very constant: no two measurements of the same activity will be identical. Some common problems with signals are the following:
- Time dilation: the signal is slower or faster.
- Shift: the entire signal is shifted in time because of a different starting point.
- Scaling: the signal is less or more forceful.
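The three distortions can be illustrated with a small base R sketch. The bump-shaped signal, the helper names `bump` and `rmse`, and all parameter values are hypothetical choices made for illustration only; the point is that a sample-by-sample comparison treats each distorted copy as a different signal.

```r
# Hypothetical example signal: one smooth bump sampled at 200 time points.
t <- seq(0, 1, length.out = 200)
bump <- function(t, centre = 0.3, width = 0.05) {
  exp(-((t - centre)^2) / (2 * width^2))
}

original <- bump(t)
shifted  <- bump(t, centre = 0.5)  # shift: same bump, later starting point
dilated  <- bump(t, width = 0.10)  # time dilation: same bump, performed more slowly
scaled   <- 2 * bump(t)            # scaling: same bump, twice as forceful

# Sample-by-sample distance between each variant and the original.
rmse <- function(a, b) sqrt(mean((a - b)^2))
distances <- c(shifted = rmse(original, shifted),
               dilated = rmse(original, dilated),
               scaled  = rmse(original, scaled))
print(round(distances, 3))
```

All three distances are clearly above zero, although every variant represents the same underlying movement.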
Since you cannot use raw signals as features for classifiers, what can you do to still build classifiers on signals?
- You need to extract time-invariant features.
- The values of the signal can be transformed into a histogram. Because a histogram discards the order of the samples, it summarises the signal irrespective of its timeframe and, when the values are normalised, of its scale.
- These summarising histograms can be used to calculate statistics on the signals that the histograms summarise.
- For example: mean, standard deviation, variance, min, max, skewness, kurtosis, range, number of modes, etc.
- As shown in the picture, even though the signals are different, their histograms look nearly identical, indicating that histograms do a good job of capturing the essence of the signals.
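A minimal base R sketch of this idea follows. The function name `signal_features`, the test signal, and the circular shift of 50 samples are assumptions made for illustration, and the skewness and kurtosis lines use simple moment estimators rather than any particular package's definition.

```r
# Summary statistics of the value distribution of a signal.
signal_features <- function(x) {
  z <- (x - mean(x)) / sd(x)  # standardised values
  c(mean = mean(x), sd = sd(x), variance = var(x),
    min = min(x), max = max(x), range = max(x) - min(x),
    skewness = mean(z^3),       # simple moment estimator
    kurtosis = mean(z^4) - 3)   # excess kurtosis (0 for a normal distribution)
}

# Hypothetical periodic signal and a time-shifted copy of it.
t <- (0:199) / 100
x <- sin(2 * pi * 5 * t) + 0.5 * sin(2 * pi * 12 * t)
shifted <- c(x[51:200], x[1:50])  # circular shift by 50 samples

# The raw signals differ sample by sample ...
raw_equal <- isTRUE(all.equal(x, shifted))

# ... but their histograms (same bin edges) and statistics agree.
breaks <- seq(min(x), max(x), length.out = 21)
h_x <- hist(x, breaks = breaks, plot = FALSE)
h_shifted <- hist(shifted, breaks = breaks, plot = FALSE)
same_histogram <- identical(h_x$counts, h_shifted$counts)
same_features <- isTRUE(all.equal(signal_features(x), signal_features(shifted)))
```

Here `same_histogram` and `same_features` are `TRUE` while `raw_equal` is `FALSE`, because the shifted copy contains exactly the same values in a different order.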
What is kurtosis of a histogram?
- Kurtosis is a statistical measure that describes the shape of the probability distribution of a random variable. It is often used to understand the "tailedness" or the degree to which the data in a distribution deviates from a normal distribution. In the context of a histogram, kurtosis can help you assess the peakedness and the presence of outliers in your data.
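- These statements refer to excess kurtosis (kurtosis minus 3, the kurtosis of a normal distribution). Positive excess kurtosis indicates heavy tails, while negative excess kurtosis indicates light tails. A value of zero suggests that the distribution is mesokurtic and approximately follows a normal distribution.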
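As a rough illustration, excess kurtosis can be estimated in base R from the fourth standardised moment. The helper `excess_kurtosis`, the seed, and the sample size are assumptions; the Laplace sample is constructed as the difference of two exponential draws to obtain a heavy-tailed distribution.

```r
# Excess kurtosis: fourth standardised moment minus 3.
excess_kurtosis <- function(x) {
  z <- (x - mean(x)) / sd(x)
  mean(z^4) - 3
}

set.seed(1)  # arbitrary seed, only for reproducibility
n <- 100000
k <- c(normal  = excess_kurtosis(rnorm(n)),            # close to 0 (mesokurtic)
       uniform = excess_kurtosis(runif(n)),            # negative (light tails)
       laplace = excess_kurtosis(rexp(n) - rexp(n)))   # positive (heavy tails)
print(round(k, 2))
```

The theoretical values are 0 for the normal, -1.2 for the uniform, and 3 for the Laplace distribution; the sample estimates should land near them.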
What is entropy of a histogram?
- Entropy, in the context of a histogram or probability distribution, is a measure of the randomness, uncertainty, or disorder within the data. It quantifies how much information is needed to describe or predict the values within a dataset or probability distribution.
- It is higher when there is more randomness or disorder in the data and lower when the data is more certain or predictable. An entropy of 0 indicates perfect certainty, which occurs when all values fall into a single bin of the histogram.
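A small base R sketch of Shannon entropy computed from histogram bin proportions follows. The helper name `hist_entropy`, the choice of 8 bins, and the two example signals are assumptions for illustration; entropy is measured in bits by using a base-2 logarithm.

```r
# Shannon entropy (in bits) of the histogram of x for given bin edges.
hist_entropy <- function(x, breaks) {
  counts <- hist(x, breaks = breaks, plot = FALSE)$counts
  p <- counts / sum(counts)  # bin proportions
  p <- p[p > 0]              # empty bins contribute nothing
  -sum(p * log2(p))
}

breaks <- seq(0, 1, length.out = 9)  # 8 equally wide bins on [0, 1]

constant <- rep(0.4, 1000)                 # every value lands in one bin
spread   <- (seq_len(1000) - 0.5) / 1000   # values spread evenly over all bins

e_constant <- hist_entropy(constant, breaks)  # no uncertainty
e_spread   <- hist_entropy(spread, breaks)    # maximum for 8 bins: log2(8) = 3 bits
```

A signal that always has the same value has entropy 0, while a signal whose values are spread evenly over all 8 bins reaches the maximum of 3 bits.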
Besides time-invariant features from histograms, what is another way of extracting features from signals?
- Spectral features are characteristics extracted from signal data in the frequency domain. They provide valuable information about the frequency components and patterns present in a signal.
- Frequency is the number of occurrences of a repeating event per unit of time, e.g. the number of peaks per second.
- Signals are composed of multiple sine waves combined. By analysing which sine waves a signal is composed of, you can extract features that describe to what extent different frequencies are present in the signal.
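The decomposition into sine waves can be sketched with the fast Fourier transform from base R (`fft`). The sampling rate of 100 Hz, the two component frequencies (5 Hz and 12 Hz), and the variable names are assumptions chosen so that both components fall exactly on frequency bins.

```r
fs <- 100            # assumed sampling rate in Hz
n  <- 200            # number of samples (2 seconds of signal)
t  <- (0:(n - 1)) / fs

# Hypothetical signal: a strong 5 Hz sine wave plus a weaker 12 Hz sine wave.
x <- sin(2 * pi * 5 * t) + 0.5 * sin(2 * pi * 12 * t)

# One-sided amplitude spectrum (scaling of the DC and Nyquist bins is ignored here).
amp  <- Mod(fft(x))[1:(n / 2 + 1)] * 2 / n
freq <- (0:(n / 2)) * fs / n   # frequency of each bin in Hz

# Example spectral features.
dominant_freq <- freq[which.max(amp)]                        # strongest frequency
top_two       <- sort(freq[order(amp, decreasing = TRUE)[1:2]])
centroid      <- sum(freq * amp) / sum(amp)                  # spectral centroid
```

In this sketch `dominant_freq` is 5 Hz and `top_two` contains 5 and 12 Hz, recovering the two sine waves the signal was built from. Such values, or the amplitudes in chosen frequency bands, can then serve as features.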
How can the caret package be used to train models?
- The caret package can be used to train classification and regression models.
- First, specify how you want your model to be trained with the trainControl function:
- trcntr <- caret::trainControl(method = "cv", number = 5)
- This instructs the subsequent training call to use cross-validation with 5 folds.
- Then train your model with that training setup:
- caret::train(y ~ ., data = data, method = "multinom", trControl = trcntr)
- This call fits a multinomial logistic regression classifier (caret's "multinom" method, which relies on the nnet package) and evaluates it with 5-fold cross-validation.