AMPLITUDE AND PITCH TRACKING

Tracking the amplitude of an audio signal is a relatively simple procedure but simply following the amplitude values of the waveform is unlikely to be useful. An audio waveform will be bipolar, expressing both positive and negative values, so to start with, some sort of rectifying of the negative part of the signal will be required. The most common method of achieving this is to square it (raise to the power of 2) and then to take the square root. Squaring any negative values will provide positive results (-2 squared equals 4). Taking the square root will restore the absolute values.

An audio signal is an oscillating signal, periodically passing through amplitude zero but these zero amplitudes do not necessarily imply that the signal has decayed to silence as our brain perceives it. Some sort of averaging will be required so that a tracked amplitude of close to zero will only be output when the signal has settled close to zero for some time. Sampling a set of values and outputting their mean will produce a more acceptable sequence of values over time for a signal's change in amplitude. Sample group size will be important: too small a sample group may result in some residual ripple in the output signal, particularly in signals with only low frequency content, whereas too large a group may result in a sluggish response to sudden changes in amplitude. Some judgement and compromise is required.
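
The trade-off between ripple and responsiveness can be checked numerically. The following is a minimal Python sketch rather than Csound code; the names track_amplitude and ripple, the 8000 Hz sample rate and the 100 Hz test tone are all assumptions made for illustration. It rectifies each sample, averages the most recent values in a circular buffer, and compares a small and a large sample group.

```python
import math

def track_amplitude(samples, window):
    """Rectify each sample, then return the running mean of the
    most recent `window` rectified values (one output per input)."""
    buf = [0.0] * window              # circular buffer of rectified values
    idx = 0
    out = []
    for x in samples:
        buf[idx] = math.sqrt(x * x)   # square, then square root
        idx = (idx + 1) % window      # wrap the write index
        out.append(sum(buf) / window) # mean of the buffer
    return out

def ripple(values):
    """Peak-to-peak variation of a tracked signal."""
    return max(values) - min(values)

SR = 8000                             # sample rate in Hz (assumed)
sine = [math.sin(2 * math.pi * 100 * n / SR) for n in range(SR)]

# Skip the first half second so that both buffers are completely filled.
small = track_amplitude(sine, 10)[SR // 2:]
large = track_amplitude(sine, 500)[SR // 2:]

print("ripple, group of 10 samples: ", round(ripple(small), 3))
print("ripple, group of 500 samples:", round(ripple(large), 3))
print("mean level, group of 500:    ", round(sum(large) / len(large), 3))
```

For a steady sine wave the mean of the rectified signal settles at about 2/pi (roughly 0.64) of the peak value, which is one reason why a tracker of this kind reports less than the absolute peak amplitude.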

The procedure described above is implemented in the following example. A simple audio note is created that ramps up and down according to a linseg envelope. In order to track its amplitude, audio values are converted to k-rate values, squared, square rooted and then written into sequential locations of an array 31 values long. The mean is calculated by summing all values in the array and dividing by the length of the array. This procedure is repeated every k-cycle. The length of the array is critical in fine tuning the response for the reasons described in the preceding paragraph. Control rate (kr) is also a factor, so it is taken into consideration when calculating the size of the array: the length is derived from ksmps (500 / ksmps), so that changing the control rate (kr) or the number of audio samples in a control period (ksmps) will not alter the response behaviour.

EXAMPLE 05L01_Amplitude_Tracking_First_Principles.csd

<CsoundSynthesizer>

<CsOptions>
-dm0 -odac
</CsOptions>

<CsInstruments>

sr = 44100
ksmps = 16
nchnls = 1
0dbfs = 1

; a rich waveform
giwave ftgen 1,0, 512, 10, 1,1/2,1/3,1/4,1/5

instr   1
 ; create an audio signal
 aenv    linseg     0,p3/2,1,p3/2,0  ; triangle shaped envelope
 aSig    poscil     aenv,300,giwave  ; audio oscillator
         out        aSig             ; send audio to output

 ; track amplitude
 kArr[]   init  500 / ksmps     ; initialise an array
 kNdx     init  0               ; initialise index for writing to array
 kSig     downsamp        aSig  ; create k-rate version of audio signal
 kSq      =     kSig ^ 2        ; square it (negatives become positive)
 kRoot    =     kSq ^ 0.5       ; square root it (restore absolute values)
 kArr[kNdx] =   kRoot           ; write result to array
 kMean      =   sumarray(kArr) / lenarray(kArr) ; calculate mean of array
                printk  0.1,kMean   ; print mean to console
; increment index and wrap-around if end of the array is met
 kNdx           wrap    kNdx+1, 0, lenarray(kArr)
endin

</CsInstruments>

<CsScore>
i 1 0 5
</CsScore>

</CsoundSynthesizer>

In practice it is not necessary for us to build our own amplitude tracker as Csound already offers several opcodes for the task. rms outputs a k-rate amplitude tracking signal by employing mathematics similar to those described above. follow outputs at a-rate and uses a sample and hold method as it outputs data, probably necessitating some sort of low-pass filtering of the output signal. follow2 also outputs at a-rate but smooths the output signal by different amounts depending on whether the amplitude is rising or falling.

A quick comparison of these three opcodes and the original method from first principles is given below:

The sound file used in all three comparisons is 'fox.wav' which can be found as part of the Csound HTML Manual download. This sound is someone saying: “the quick brown fox jumps over the lazy dog”.

First of all, by employing the technique exemplified in example 05L01, the amplitude following signal is overlaid upon the source signal:

It can be observed that the amplitude tracking signal follows the amplitudes of the input signal reasonably well. A slight delay in response at sound onsets can be observed as the array of values used by the averaging mechanism fills with appropriately high values. As discussed earlier, reducing the size of the array will improve response at the risk of introducing ripple. Another approach to dealing with the issue of ripple is to low-pass filter the signal output by the amplitude follower. This is an approach employed by the follow2 opcode. The second thing that is apparent is that the amplitude following signal does not attain the peak value of the input signal. At its peaks, the amplitude following signal is roughly 1/3 of the absolute peak value of the input signal. How close it gets to the absolute peak amplitude depends somewhat on the dynamic nature of the input signal. If an input signal sustains a peak amplitude for some time then the amplitude following signal will tend to this peak value.

The rms opcode employs a method similar to that used in the previous example but with the convenience of an encapsulated opcode. Its output superimposed upon the waveform is shown below:

Its method of averaging uses filtering rather than simply taking a mean of a buffer of amplitude values. rms allows us to set the cutoff frequency (iCf) of its internal filter. This is an i-rate argument, so it is fixed at initialisation time:

kRms rms aSig, iCf

This is an optional argument which defaults to 10. Lowering this value will dampen changes in rms and smooth out ripple, raising it will improve the response but increase the audibility of ripple. A choice can be made based on some foreknowledge of the input audio signal: dynamic percussive input audio might demand faster response whereas audio that dynamically evolves gradually might demand greater smoothing.
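
The effect of the cutoff frequency can be illustrated with a small Python sketch. It assumes that the tracker squares the signal, smooths the result with a one-pole low-pass filter and then takes the square root; the coefficient formula is a common one-pole design and is an assumption here, not a transcription of Csound's source code. The names rms_track and first_reach are hypothetical.

```python
import math

def rms_track(samples, sr, cutoff=10.0):
    """Square each sample, smooth with a one-pole low-pass filter
    (assumed design), then take the square root."""
    b = 2.0 - math.cos(2.0 * math.pi * cutoff / sr)
    c2 = b - math.sqrt(b * b - 1.0)   # feedback coefficient
    c1 = 1.0 - c2                     # input coefficient
    q = 0.0                           # smoothed mean square
    out = []
    for x in samples:
        q = c1 * x * x + c2 * q
        out.append(math.sqrt(q))
    return out

def first_reach(values, level):
    """Index of the first value at or above `level`."""
    return next(i for i, v in enumerate(values) if v >= level)

SR = 8000                             # sample rate in Hz (assumed)
tone = [math.sin(2 * math.pi * 100 * n / SR) for n in range(SR)]

slow = rms_track(tone, SR, 10.0)      # default cutoff: smoother, slower
fast = rms_track(tone, SR, 50.0)      # higher cutoff: quicker, more ripple

print("settled value, cutoff 10:", round(slow[-1], 3))
print("samples to reach 0.5, cutoff 10:", first_reach(slow, 0.5))
print("samples to reach 0.5, cutoff 50:", first_reach(fast, 0.5))
```

For a full-scale sine wave the tracked value settles close to 0.707, the RMS of a sine of amplitude 1, and the higher cutoff reaches a given level in fewer samples.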

The follow opcode uses a sample-and-hold mechanism when outputting the tracked amplitude. This can result in a stepped output that might require additional low-pass filtering before use. We define the period, the duration for which values are held, using its second input argument. The update rate will be one over the period. In the following example the audio is amplitude tracked using the following line:

aRms follow aSig, 0.01

with the following result:

The hump over the word spoken during the third and fourth time divisions may initially seem erroneous, but it is a result of greater amplitude excursion into the negative domain. follow provides a better reflection of absolute peak amplitude.

follow2 uses a different algorithm with smoothing on both upward and downward slopes of the tracked amplitude. We can define different values for attack and decay time. In the following example the decay time is much longer than the attack time. The relevant line of code is:

iAtt  =        0.04
iRel  =        0.5
aTrk  follow2  aSig, iAtt, iRel

and the result of amplitude tracking is:

This technique can be used to extend the duration of short input sound events or triggers. Note that the attack and release times for follow2 can also be modulated at k-rate.
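
A follower with separate attack and release smoothing can be sketched in Python as follows. This is an illustration of the general idea only: it assumes that each time is the period over which the difference between follower and input shrinks by 60 dB, which may not match the exact definition used inside follow2. The function name follow_att_rel and the test signal are hypothetical.

```python
import math

def follow_att_rel(samples, sr, att, rel):
    """Envelope follower with different smoothing for rising and
    falling input. `att` and `rel` are assumed to be 60 dB times."""
    g_att = math.exp(math.log(0.001) / (sr * att))
    g_rel = math.exp(math.log(0.001) / (sr * rel))
    env = 0.0
    out = []
    for x in samples:
        level = abs(x)
        g = g_att if level > env else g_rel   # rising or falling?
        env = level + g * (env - level)       # move towards the input
        out.append(env)
    return out

SR = 8000                                     # sample rate in Hz (assumed)
burst = [math.sin(2 * math.pi * 200 * n / SR) for n in range(SR // 5)]
signal = burst + [0.0] * SR                   # 0.2 s of tone, 1 s of silence
env = follow_att_rel(signal, SR, att=0.04, rel=0.5)

end = len(burst) - 1                          # last sample of the burst
print("follower at end of burst:", round(env[end], 3))
print("follower 0.1 s later:    ", round(env[end + SR // 10], 3))
```

Because the release time is much longer than the attack time, the follower is still well above zero a tenth of a second after the input has fallen silent, which is what allows a short event to be stretched into a longer control signal.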

Dynamic Gating and Amplitude Triggering

Once we have traced the changing amplitude of an audio signal it is straightforward to use specific changes in that function to trigger other events within Csound. The simplest technique would be to simply define a threshold above which one thing happens and below which something else happens. A crude dynamic gating of the signal above could be implemented thus:

EXAMPLE 05L02_Simple_Dynamic_Gate.csd

<CsoundSynthesizer>

<CsOptions>  
-dm0 -odac
</CsOptions>

<CsInstruments>

ksmps = 32
0dbfs = 1 
; this is a necessary definition,
;         otherwise amplitude will be -32768 to 32767

instr    1
 aSig    diskin  "fox.wav", 1        ; read sound file
 kRms    rms     aSig                ; scan rms
 iThreshold =    0.1                 ; rms threshold
 kGate   =       kRms > iThreshold ? 1 : 0  ; gate either 1 or zero
 aGate   interp  kGate   ; interpolate to create smoother on->off->on switching
 aSig    =       aSig * aGate        ; multiply signal by gate
         out     aSig                ; send to output
endin

</CsInstruments>

<CsScore>
i 1 0 10
</CsScore>

</CsoundSynthesizer>

Once a dynamic threshold has been defined, in this case 0.1, the RMS value is interrogated every k-cycle as to whether it is above or below this value. If it is above, the variable kGate adopts a value of '1' (open); if below, kGate is zero (closed). This on/off switch could simply be multiplied with the audio signal to turn it on or off according to the status of the gate, but clicks would manifest each time the gate opens or closes, so some sort of smoothing or ramping of the gate signal is required. In this example I have simply interpolated it using the 'interp' opcode to create an a-rate signal which is then multiplied with the original audio signal. This means that a linear ramp will be applied across the duration of a k-cycle in audio samples, in this case 32 samples. A more elaborate approach might involve portamento and low-pass filtering. The results of this dynamic gate are shown below:

The threshold is depicted as a red line. It can be seen that each time the RMS value (the black line) drops below the threshold the audio signal (blue waveform) is muted.
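
The difference between a hard gate and an interpolated gate can be seen in a short Python sketch. It assumes that the interpolation ramps linearly from the previous control value to the current one across each block of ksmps samples; the function names and the five RMS readings are made up for illustration.

```python
def gate_values(rms_values, threshold):
    """One gate value per control period: 1 (open) or 0 (closed)."""
    return [1.0 if r > threshold else 0.0 for r in rms_values]

def interpolate_gate(k_gate, ksmps, start=0.0):
    """Ramp linearly from the previous gate value to the current one
    across each block of `ksmps` audio samples (assumed behaviour)."""
    a_gate = []
    prev = start
    for target in k_gate:
        step = (target - prev) / ksmps
        for n in range(1, ksmps + 1):
            a_gate.append(prev + step * n)
        prev = target
    return a_gate

def max_step(values):
    """Largest jump between two adjacent samples."""
    return max(abs(b - a) for a, b in zip(values, values[1:]))

KSMPS = 32
k_rms = [0.05, 0.2, 0.3, 0.08, 0.02]          # hypothetical RMS readings
k_gate = gate_values(k_rms, 0.1)

hard = [g for g in k_gate for _ in range(KSMPS)]   # gate held per block
smooth = interpolate_gate(k_gate, KSMPS)

print("gate per k-cycle:", k_gate)
print("largest jump, hard gate:        ", max_step(hard))
print("largest jump, interpolated gate:", max_step(smooth))
```

The hard gate jumps by the full gate depth in a single sample, which is what produces a click, whereas the interpolated gate never changes by more than 1/ksmps per sample.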

The simple solution described above can prove adequate in applications where the user wishes to sense sound event onsets and convert them to triggers but in more complex situations, in particular when a new sound event occurs whilst the previous event is still sounding and pushing the RMS above the threshold, this mechanism will fail. In these cases triggering needs to depend upon dynamic change rather than absolute RMS values. If we consider a two-event sound file where two notes sound on a piano, the second note sounding while the first is still decaying, triggers generated using the RMS threshold mechanism from the previous example will only sense the first note onset. (In the diagram below this sole trigger is illustrated by the vertical black line.) Raising the threshold might seem to be remedial action but is not ideal as this will prevent quietly played notes from generating triggers.

It will often be more successful to use magnitudes of amplitude increase to decide whether to generate a trigger or not. The two critical values in implementing such a mechanism are the time across which a change will be judged ('iSampTim' in the example) and the amount of amplitude increase that will be required to generate a trigger (iThresh). An additional mechanism to prevent double triggerings if an amplitude continues to increase beyond the time span of a single sample period will also be necessary. What this mechanism will do is to bypass the amplitude change interrogation code for a user-definable time period immediately after a trigger has been generated (iWait). A timer which counts elapsed audio samples (kTimer) is used to time how long to wait before retesting amplitude changes.
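
The logic can be tried out on synthetic data with the following Python sketch. It is an illustration under stated assumptions: the envelope is a made-up pair of overlapping decaying notes, the control rate is taken to be 1000 Hz, and the timer counts control periods rather than audio samples. The function names are hypothetical.

```python
import math

KR = 1000                      # control rate in Hz (assumed)

def two_note_envelope(duration=1.0, onsets=(0.1, 0.4), peak=0.5,
                      attack=0.005, decay=0.5):
    """Amplitude envelope of two overlapping, decaying notes."""
    n_total = round(duration * KR)
    env = [0.0] * n_total
    for onset in onsets:
        start = round(onset * KR)
        for n in range(start, n_total):
            t = (n - start) / KR
            if t < attack:
                env[n] += peak * t / attack
            else:
                env[n] += peak * math.exp(-(t - attack) / decay)
    return env

def threshold_triggers(env, thresh=0.1):
    """Trigger whenever the envelope rises through a fixed threshold."""
    return [n for n in range(1, len(env))
            if env[n] > thresh and env[n - 1] <= thresh]

def change_triggers(env, thresh=0.1, samp_time=0.01, wait=0.05):
    """Trigger when the envelope has risen by more than `thresh`
    within `samp_time`, then ignore changes for `wait` seconds."""
    lag = round(samp_time * KR)
    wait_n = round(wait * KR)
    timer = wait_n + 1         # start beyond the wait time
    triggers = []
    for n in range(len(env)):
        prev = env[n - lag] if n >= lag else 0.0
        change = env[n] - prev
        if timer > wait_n:
            if change > thresh:
                triggers.append(n)
                timer = 0      # restart the wait period
        else:
            timer += 1
    return triggers

env = two_note_envelope()
print("threshold method:", [n / KR for n in threshold_triggers(env)])
print("change method:   ", [n / KR for n in change_triggers(env)])
```

With this test envelope the fixed threshold reports only the first onset, because the first note is still above the threshold when the second one begins, whereas the change-based method reports both.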

If we pass our piano sound file through this instrument, the results look like this:

This time we correctly receive two triggers, one at the onset of each note. The example below tracks audio from the sound-card input channel 1 using this mechanism.

EXAMPLE 05L03_Dynamic_Trigger.csd

<CsoundSynthesizer>

<CsOptions>  
-dm0 -iadc -odac
</CsOptions>

<CsInstruments>

sr     =  44100
ksmps  =  32
nchnls =  2
0dbfs  =  1

instr   1
 iThresh  =       0.1                ; change threshold
 aSig     inch    1                  ; live audio in
 iWait    =       1000              ; prevent repeats wait time (in samples)
 kTimer   init    1001               ; initial timer value
 kRms     rms     aSig, 20           ; track amplitude
 iSampTim =       0.01    ; time across which change in RMS will be measured
 kRmsPrev delayk  kRms, iSampTim     ; delayed RMS (previous)
 kChange  =       kRms - kRmsPrev    ; change
 if(kTimer>iWait) then               ; if we are beyond the wait time...
  kTrig   =       kChange > iThresh ? 1 : 0 ; trigger if threshold exceeded
  kTimer  =       kTrig == 1 ? 0 : kTimer ; reset timer when a trigger generated
 else                     ; otherwise (we are within the wait time buffer)
  kTimer  +=      ksmps              ; increment timer
  kTrig   =       0                  ; cancel trigger
 endif
          schedkwhen kTrig,0,0,2,0,0.1 ; trigger a note event
endin

instr   2
 aEnv     transeg   0.2, p3, -4, 0     ; decay envelope
 aSig     poscil    aEnv, 400          ; 'ping' sound indicator
          out       aSig               ; send audio to output
endin

</CsInstruments>

<CsScore>
i 1 0 [3600*24*7]
</CsScore>
</CsoundSynthesizer>

Pitch Tracking

Csound currently provides five opcode options for pitch tracking. In ascending order of newness they are: pitch, pitchamdf, pvspitch, ptrack and plltrack. Related to these opcodes are pvscent and centroid but rather than track the harmonic fundamental, they track the spectral centroid of a signal. An example and suggested application for centroid is given a little later on in this chapter.

Each offers a slightly different set of features: some offer simultaneous tracking of both amplitude and pitch, some only pitch tracking. None of these opcodes provides more than one output for tracked frequency, therefore none offers polyphonic tracking, although in a polyphonic tone the fundamental of the strongest tone will most likely be tracked. Pitch tracking presents many more challenges than amplitude tracking, therefore a degree of error can be expected and will be an issue that demands addressing. To get the best from any pitch tracker it is important to consider preparation of the input signal, either through gating or filtering, and also processing of the output tracking data: for example, smoothing changes through the use of a filtering opcode such as port, median filtering to remove erratic and erroneous data, and a filter to simply ignore obviously incorrect data.
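
The two clean-up steps for the output data, ignoring obviously incorrect values and median filtering, can be sketched in Python as below. The function names, the permitted range of 80 to 1000 Hz and the list of tracked frequencies are all hypothetical; the sketch simply holds the last valid value when a reading falls outside the expected range.

```python
from statistics import median

def reject_out_of_range(track, lo, hi):
    """Replace values outside [lo, hi] with the most recent valid value."""
    # Before any valid value has been seen, use the first valid one.
    last_valid = next((hz for hz in track if lo <= hz <= hi), lo)
    cleaned = []
    for hz in track:
        if lo <= hz <= hi:
            last_valid = hz
        cleaned.append(last_valid)
    return cleaned

def median_filter(track, width=5):
    """Median of a window centred on each value (shorter at the edges)."""
    half = width // 2
    out = []
    for n in range(len(track)):
        window = track[max(0, n - half): n + half + 1]
        out.append(median(window))
    return out

# Hypothetical tracker output in Hz: a note near 220 Hz with one
# octave error (440) and one wildly wrong reading (5000).
raw = [220, 221, 219, 440, 220, 222, 5000, 221, 220, 219]

in_range = reject_out_of_range(raw, 80, 1000)
smoothed = median_filter(in_range, 5)

print("raw:     ", raw)
print("in range:", in_range)
print("smoothed:", smoothed)
```

The range check removes the impossible reading, and the median filter removes the isolated octave jump without smearing it into neighbouring values as an averaging filter would.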

Parameters for these procedures will rely upon some prior knowledge of the input signal, the pitch range of an instrument for instance. A particularly noisy environment or a distant microphone placement might demand more aggressive noise gating. In general some low-pass filtering of the input signal will always help in providing a more stable frequency tracking signal. Something worth considering is that the attack portion of a note played on an acoustic instrument generally contains a lot of noisy, harmonically chaotic material. This will tend to result in slightly chaotic movement in the pitch tracking signal. We may therefore wish to sense the onset of a note and only begin tracking pitch once the sustain portion has begun. This may be around 0.05 seconds after the note has begun but will vary from instrument to instrument and from note to note. In general lower notes will have a longer attack. However, we do not want to overestimate the duration of this attack stage as this will result in a sluggish pitch tracker. Another specialised situation is the tracking of pitch in singing, where we may want to gate sibilant elements ('sss', 't' etc.). pvscent can be useful in detecting the difference between vowels and sibilants.

'pitch' is the oldest of the pitch tracking opcodes on offer and provides the widest range of input parameters.

koct, kamp pitch asig, iupdte, ilo, ihi, idbthresh [, ifrqs] [, iconf] \
      [, istrt] [, iocts] [, iq] [, inptls] [, irolloff] [, iskip]

This makes it somewhat more awkward to use initially (although many of its input parameters are optional) but some of its options facilitate quite specialised effects. Firstly it outputs its tracking signal in 'oct' format. This might prove to be a useful format but conversion to other formats is easy anyway. Apart from a number of parameters intended to fine tune the production of an accurate signal it allows us to specify the number of octave divisions used in quantising the output. For example if we give this a value of 12 we have created the basis of a simple chromatic 'autotune' device. We can also quantise the procedure in the time domain using its 'update period' input. Material with quickly changing pitch or vibrato will require a shorter update period (which will demand more from the CPU). It has an input control for 'threshold of detection' which can be used to filter out and disregard pitch and amplitude tracking data beneath this limit. Pitch is capable of very good pitch and amplitude tracking results in real-time.

pitchamdf uses the so-called 'Average Magnitude Difference Function' method. It is perhaps slightly more accurate than pitch as a general purpose pitch tracker but its CPU demand is higher.

pvspitch uses streaming FFT technology to track pitch. It takes an f-signal as input which will have to be created using the pvsanal opcode. At this step the choice of FFT size will have a bearing upon the performance of the pvspitch pitch tracker. Smaller FFT sizes will allow for faster tracking but with perhaps some inaccuracies, particularly with lower pitches whereas larger FFT sizes are likely to provide for more accurate pitch tracking at the expense of some time resolution. pvspitch tries to mimic certain functions of the human ear in how it tries to discern pitch. pvspitch works well in real-time but it does have a tendency to jump its output to the wrong octave – an octave too high – particularly when encountering vibrato.
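A minimal sketch of this two-stage arrangement, pvsanal followed by pvspitch, could look like the following. The test source with vibrato, the FFT size of 1024, the overlap of a quarter of the FFT size and the amplitude threshold of 0.01 are assumptions made for illustration; changing ifftsize is the simplest way to hear the trade-off between tracking speed and accuracy.

```csound
<CsoundSynthesizer>
<CsOptions>
-odac -dm0
</CsOptions>
<CsInstruments>
sr = 44100
ksmps = 32
nchnls = 1
0dbfs = 1

instr 1
 ; test source (an assumption): a sawtooth with gentle vibrato around 330 Hz
 kvib    lfo     5, 5                     ; +/- 5 Hz at a rate of 5 Hz
 asrc    vco2    0.3, 330 + kvib
 ; analysis stage: larger FFT sizes favour accuracy, smaller ones speed
 ifftsize =      1024
 fsig    pvsanal asrc, ifftsize, ifftsize/4, ifftsize, 1 ; 1 = von Hann window
 ; tracking stage: 0.01 is an illustrative amplitude threshold
 kfr, kamp pvspitch fsig, 0.01
         printk  0.25, kfr
 ; resynthesise the tracked pitch so that it can be judged by ear
 aout    poscil  0.2, kfr
         out     aout
endin

</CsInstruments>
<CsScore>
i 1 0 5
</CsScore>
</CsoundSynthesizer>
```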

ptrack also makes use of streaming FFT but takes a normal audio signal as input, performing the FFT analysis internally. We still have to provide a value for FFT size with the same considerations mentioned above. ptrack is based on an algorithm by Miller Puckette, the co-creator of MaxMSP and creator of PD. ptrack also works well in real-time but it does have a tendency to jump to erroneous pitch tracking values when pitch is changing quickly or when encountering vibrato. Median filtering (using the mediank opcode) and filtering of outlying values might improve the results.
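The clean-up suggested above might be sketched like this: values outside an expected range are discarded (the last good value is held instead) and the result is passed through mediank. The test source, the size value of 512 given to ptrack, the 100 to 1000 Hz range and the median window of 9 values are assumptions chosen for this sketch.

```csound
<CsoundSynthesizer>
<CsOptions>
-odac -dm0
</CsOptions>
<CsInstruments>
sr = 44100
ksmps = 32
nchnls = 1
0dbfs = 1

instr 1
 ; test source (an assumption): a sawtooth with fast, fairly wide vibrato
 kvib    lfo     10, 6
 asrc    vco2    0.3, 262 + kvib
 ; ptrack analyses the audio signal directly; the size must be a power of two
 kcps, kamp ptrack asrc, 512
 ; discard outlying values, holding the last value that was in range
 iMin    =       100                      ; hypothetical lower limit (Hz)
 iMax    =       1000                     ; hypothetical upper limit (Hz)
 kgood   init    0
 if kcps > iMin && kcps < iMax then
  kgood  =       kcps
 endif
 ; median filter across the 9 most recent values
 kmed    mediank kgood, 9, 9
         printk  0.25, kmed
 ; resynthesise the cleaned-up pitch for comparison by ear
 aout    poscil  0.2, kmed
         out     aout
endin

</CsInstruments>
<CsScore>
i 1 0 5
</CsScore>
</CsoundSynthesizer>
```

A longer median window removes more stray values but also makes the output respond more slowly to genuine changes of pitch.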

plltrack uses a phase-locked loop algorithm in detecting pitch. plltrack is another efficient real-time option for pitch tracking. It has a tendency to gliss up and down from very low frequency values at the start and end of notes, i.e. when encountering silence. This effect can be minimised by increasing its 'feedback' parameter but this can also make pitch tracking unstable over sustained notes.
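The behaviour around silence can be observed with a sketch such as the one below, in which two notes are separated by a gap. The feedback value of 0.2 is an arbitrary starting point for experiment. The amplitude gate that zeroes the printed value during silence is an addition of this sketch, not something plltrack itself provides, and its threshold of 0.01 is likewise an assumption.

```csound
<CsoundSynthesizer>
<CsOptions>
-odac -dm0
</CsOptions>
<CsInstruments>
sr = 44100
ksmps = 32
nchnls = 1
0dbfs = 1

instr 1
 ; test source (an assumption): two 330 Hz notes with 0.5 s of silence between
 kenv    linseg  0,0.05,0.3, 0.9,0.3, 0.05,0, 0.5,0, 0.05,0.3, 0.9,0.3, 0.05,0
 asrc    poscil  kenv, 330
 kd      =       0.2                      ; PLL feedback, try other values
 acps, alock plltrack asrc, kd
 kcps    downsamp acps                    ; a-rate output to k-rate
 ; hypothetical gate: report 0 while the input is effectively silent
 krms    rms     asrc
 kshow   =       krms > 0.01 ? kcps : 0
         printk  0.1, kshow
         out     asrc
endin

</CsInstruments>
<CsScore>
i 1 0 2.5
</CsScore>
</CsoundSynthesizer>
```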

In conclusion, pitch is probably still the best choice as a general purpose pitch tracker, and pitchamdf is also a good choice. pvspitch, ptrack and plltrack all work well in real-time but might demand additional processing to remove errors.

pvscent and centroid are a little different to the other pitch trackers in that, rather than try to discern the fundamental of a harmonic tone, they assess what the centre of gravity of a spectrum is. An application for this is in the identification of different instruments playing the same note. Softer, darker instruments, such as the french horn, will be characterised by a lower centroid than that of more shrill instruments, such as the violin.
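The idea that a darker timbre yields a lower centroid than a brighter one at the same pitch can be tried with a sketch along these lines. The same 220 Hz sawtooth is heard first through a low-pass filter at 600 Hz and then almost unfiltered; the filter settings, the FFT size of 1024 and the update rate of ten times per second are assumptions made for this illustration.

```csound
<CsoundSynthesizer>
<CsOptions>
-odac -dm0
</CsOptions>
<CsInstruments>
sr = 44100
ksmps = 32
nchnls = 1
0dbfs = 1

instr 1
 ; the same note throughout: a 220 Hz sawtooth
 asaw    vco2    0.3, 220
 ; dark for the first half of the note, bright for the second half
 kcf     linseg  600, p3/2, 600, 0.05, 12000, p3/2, 12000
 asrc    butlp   asaw, kcf
 ; ask centroid for a new value ten times per second
 ktrig   metro   10
 kcent   centroid asrc, ktrig, 1024
         printk  0.25, kcent
         out     asrc
endin

</CsInstruments>
<CsScore>
i 1 0 4
</CsScore>
</CsoundSynthesizer>
```

The printed centroid should be noticeably lower during the first half of the note, even though the fundamental does not change.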

Both opcodes use FFT. centroid works directly with an audio signal input whereas pvscent requires an f-sig input. centroid also features a trigger input which allows us to manually trigger it to update its output. In the following example we use centroid to detect individual drum sounds (bass drum, snare drum, cymbal) within a drum loop. We will use the dynamic amplitude trigger from earlier on in this chapter to detect when sound onsets are occurring and use this trigger to activate centroid and also then to trigger another instrument with a replacement sound.

Each percussion instrument in the original drum loop will be replaced with a different sound: bass drums will be replaced with a cowbell sound (the instrument named Cowbell in the code below), snare drums will be replaced by hand claps (à la TR-808), and cymbal sounds will be replaced with tambourine sounds. The drum loop used is beats.wav which can be found with the download of the Csound HTML manual (and within the Csound download itself). This loop is not ideal as some of the instruments coincide with one another; for example, the first hit consists of a bass drum and a snare drum played together. The 'beat replacer' will inevitably make a decision one way or the other but is not advanced enough to detect both instruments playing simultaneously. The critical stage is the series of if... elseif... else tests at the bottom of instrument 1, where decisions are made about instruments' identities according to what centroid band they fall into. The user can fine tune the boundary values (2500 and 8000 in the code) to modify the decision making process. centroid values are also printed to the terminal when onsets are detected, which might assist in this fine tuning.

EXAMPLE 05L04_Drum_Replacement.csd

<CsoundSynthesizer>
<CsOptions>
-dm0 -odac
</CsOptions>
<CsInstruments>

sr = 44100
ksmps = 32
nchnls = 1
0dbfs = 1

instr   1
 asig   diskin  "beats.wav",1

 iThreshold = 0.05
 iWait      = 0.1*sr
 kTimer     init iWait+1
 iSampTim =       0.02                ; time across which RMS change is measured
 kRms   rms     asig ,20
 kRmsPrev       delayk  kRms,iSampTim ; rms from earlier
 kChange =      kRms - kRmsPrev       ; change (+ve or -ve)

 if kTimer > iWait then               ; prevent double triggerings
  ; generate a trigger
  kTrigger   =  kChange > iThreshold ? 1 : 0
  ; if trigger is generated, reset timer
  kTimer  =   kTrigger == 1 ? 0 : kTimer
 else
  kTimer  +=  ksmps                   ; increment timer
  kTrigger = 0                        ; clear trigger
 endif

 ifftsize = 1024
 ; centroid triggered 0.02 after sound onset to avoid noisy attack
 kDelTrig delayk kTrigger,0.02
 kcent  centroid asig, kDelTrig, ifftsize  ; scan centroid
        printk2  kcent            ; print centroid values
 if kDelTrig==1 then
  if kcent>0 && kcent<2500 then   ; first freq. band
   event "i","Cowbell",0,0.1
  elseif kcent<8000 then          ; second freq. band
   event "i","Clap",0,0.1
  else                            ; third freq. band
   event "i","Tambourine",0,0.5
  endif
 endif
endin

instr   Cowbell
 kenv1  transeg 1,p3*0.3,-30,0.2, p3*0.7,-30,0.2
 kenv2  expon   1,p3,0.0005
 kenv   =       kenv1*kenv2
 ipw    =       0.5
 a1     vco2    0.65,562,2,0.5
 a2     vco2    0.65,845,2,0.5
 amix   =       a1+a2
 iLPF2  =       10000
 kcf    expseg  12000,0.07,iLPF2,1,iLPF2
 alpf   butlp   amix,kcf
 abpf   reson   amix, 845, 25
 amix   dcblock2        (abpf*0.06*kenv1)+(alpf*0.5)+(amix*0.9)
 amix   buthp   amix,700
 amix   =       amix*0.5*kenv
        out     amix
endin

instr   Clap
 if frac(p1)==0 then
  event_i       "i", p1+0.1, 0,     0.02
  event_i       "i", p1+0.1, 0.01,  0.02
  event_i       "i", p1+0.1, 0.02,  0.02
  event_i       "i", p1+0.1, 0.03,  2
 else
  kenv  transeg 1,p3,-25,0
  iamp  random  0.7,1
  anoise        dust2   kenv*iamp, 8000
  iBPF          =       1100
  ibw           =       2000
  iHPF          =       1000
  iLPF          =       1
  kcf   expseg  8000,0.07,1700,1,800,2,500,1,500
  asig  butlp   anoise,kcf*iLPF
  asig  buthp   asig,iHPF
  ares  reson   asig,iBPF,ibw,1
  asig  dcblock2        (asig*0.5)+ares
        out     asig
 endif
endin

instr   Tambourine
        asig    tambourine      0.3,0.01 ,32, 0.47, 0, 2300 , 5600, 8000
                out     asig    ;SEND AUDIO TO OUTPUTS
endin

</CsInstruments>
<CsScore>
i 1 0 10
</CsScore>
</CsoundSynthesizer>