Sound - for audio playback

Audio playback is handled by the Sound class. PsychoPy® currently supports a choice of sound engines: PTB, pyo, sounddevice or pygame. You can select which one is used via the audioLib preference; sound.Sound() will then refer to the corresponding backend.

This preference can be set on a per-experiment basis by importing preferences, and setting the audioLib option to use. Audio playback backends vary in performance due to all sorts of factors. Based on testing done by the PsychoPy® team and reports from users, their performance can be summarized as follows:

Sounds are actually generated by a variety of classes, depending on which “backend” you use (like pyo or sounddevice) and these different backends can have slightly different attributes, as below. The user should typically do:

from psychopy.sound import Sound

The class that gets imported will then be an alias of one of the Sound Classes described below.
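To pin the backend for a single experiment, the audioLib preference has to be set before psychopy.sound is first imported. The following is a minimal sketch, assuming the preference is stored at prefs.hardware['audioLib'] and accepts a list of backend names in order of preference. The helper request_audio_lib is hypothetical (not part of PsychoPy) and only exists so the snippet also runs where PsychoPy is not installed:

```python
def request_audio_lib(libs=("PTB",)):
    """Ask PsychoPy to try these audio backends, in order of preference.

    Hypothetical helper: returns the stored preference as a list, or None
    if PsychoPy is not installed.
    """
    try:
        from psychopy import prefs
    except ImportError:
        return None
    prefs.hardware["audioLib"] = list(libs)
    return list(prefs.hardware["audioLib"])


chosen = request_audio_lib(["PTB"])

# Import the sound module only after the preference has been set, e.g.:
#     from psychopy import sound
#     print(sound.Sound)  # shows which backend class was selected
```

Printing sound.Sound afterwards is a quick way to confirm which backend class the alias resolved to on your machine.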

PTB audio latency

PTB brings a number of advantages in terms of latency.

The first is that it has been designed specifically with low-latency playback in mind (rather than, say, on-the-fly mixing and filtering capabilities). Mario Kleiner has worked very hard to get the best out of the drivers available on each operating system and, as a result, with the most aggressive low-latency settings you can get a sound to play in “immediate” mode with a lag typically in the region of 5 ms and a precision of maybe 1 ms. That’s pretty good compared to the other options, which have a lag of 20 ms upwards and several ms of variability.

BUT, on top of that, PTB allows you to preschedule your sound to occur at a particular point in time (e.g. when the trigger is due to be sent or when the screen is due to flip) and the PTB engine will then prepare all the buffers ready to go and will also account for the known latencies in the card. With this method the PTB engine is capable of sub-ms precision and even sub-ms lag!

Of course, capable doesn’t mean it’s happening in your case. It can depend on many things about the local operating system and hardware. You should test it yourself for your kit, but here is an example of a standard Win10 box using built-in audio (not a fancy audio card):


Fig. 58 Sub-ms audio timing with standard audio on Win10. Yellow trace is a 440 Hz tone played at 48 kHz with PTB engine. Cyan trace is the trigger (from a Labjack output). Gridlines are set to 1 ms.

Preschedule your sound

The most precise way to use the PTB audio backend is to preschedule the playing of a sound. By doing this PTB can take into account both the time taken to load the sound (it will be preloaded, ready to play) and the time taken by the hardware to start playing it.

To do this you can call play() with an argument called when. The when argument needs to be in the PsychToolBox clock timebase, which can be accessed using psychtoolbox.GetSecs() if you want to play the sound at an arbitrary time (not in sync with a window flip).

For instance:

import psychtoolbox as ptb
from psychopy import sound

mySound = sound.Sound('A')
now = ptb.GetSecs()
mySound.play(when=now+0.5)  # play in EXACTLY 0.5s

Or use Window.getFutureFlipTime(clock='ptb') if you want the onset synchronized with a screen refresh:

import psychtoolbox as ptb
from psychopy import sound, visual

mySound = sound.Sound('A')

win = visual.Window()
win.flip()
nextFlip = win.getFutureFlipTime(clock='ptb')

mySound.play(when=nextFlip)  # sync with screen refresh

The precision of that timing still depends on the PTB Audio Latency Mode, and prescheduling cannot work if the delay before the requested time is too short for the requested mode. For example, if you request that the sound starts on the next refresh but set the latency mode to 0 (which has a lag of around 300 ms), then the timing will be very poor.
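The lead-time constraint can be checked with simple arithmetic before scheduling. The sketch below uses only the rough lags quoted on this page (about 300 ms for mode 0, about 5 ms for the most aggressive settings, assumed here to apply to mode 3); the table ASSUMED_LAG_S and the function has_enough_lead_time are hypothetical names, and real lags must be measured on your own hardware:

```python
# Rough lags quoted in this document; measure your own hardware.
ASSUMED_LAG_S = {
    0: 0.300,  # mode 0: "in the region of 300 ms"
    3: 0.005,  # aggressive low-latency playback: "in the region of 5 ms"
}


def has_enough_lead_time(when, now, latency_mode):
    """Return True if `when` is far enough after `now` for this mode."""
    return (when - now) >= ASSUMED_LAG_S[latency_mode]


now = 100.0                   # stands in for the value of ptb.GetSecs()
next_flip = now + 1.0 / 60.0  # next refresh on a 60 Hz display

print(has_enough_lead_time(next_flip, now, latency_mode=0))  # False
print(has_enough_lead_time(next_flip, now, latency_mode=3))  # True
```

With a 60 Hz display the next flip is only about 16.7 ms away, which is why mode 0 cannot honour a request to start on the next refresh.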

PTB Audio Latency Modes

When using the PTB backend you can choose the Latency Mode, referred to in PsychToolBox as the reqlatencyclass. It can be set in psychopy.hardware.speaker.SpeakerDevice.

PsychoPy® uses Mode 3 as the default, assuming that you want low latency and you don’t care if other applications can’t play sound at the same time (don’t listen to iTunes while running your study!).

The modes are as follows:

0 : Latency not important

For when it really doesn’t matter. Latency can easily be in the region of 300 ms! The advantage of this mode is that it will always work and always play a sound, whatever the format of the sounds that have already been played (with modes 2, 3 and 4 you can obtain low latency but the sampling rate must be the same throughout the experiment).

1 : Share low-latency access

Tries to use a low-latency setup in combination with other applications. Latency usually isn’t very good, and in MS Windows the sound you play must have the same sample rate as any other application that is using the sound system (which means you are usually restricted to exactly 48000 Hz instead of 44100 Hz).

2 : Exclusive mode low-latency

Takes control of the audio device you’re using and dominates it. That can cause some problems for other apps if they’re trying to play sounds at the same time.

3 : Aggressive exclusive mode

As Mode 2 but with more aggressive settings to prioritise our use of the card over all others. This is the recommended mode for most studies.

4 : Critical mode

As Mode 3 except that, if we fail to be totally dominant, an error is raised rather than just accepting our slightly less dominant status.
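For scripts that pass the mode around, the five values can be given readable names. This is only an illustrative sketch: the numeric values come from the list above, but the LatencyMode enum, its member names and needs_fixed_sample_rate are hypothetical and not part of PsychoPy or PsychToolBox:

```python
from enum import IntEnum


class LatencyMode(IntEnum):
    """PTB reqlatencyclass values; member names are illustrative only."""
    NOT_IMPORTANT = 0
    SHARED_LOW_LATENCY = 1
    EXCLUSIVE = 2
    AGGRESSIVE_EXCLUSIVE = 3
    CRITICAL = 4


def needs_fixed_sample_rate(mode):
    """Modes 2-4 need one sample rate for the whole experiment."""
    return LatencyMode(mode) >= LatencyMode.EXCLUSIVE


mode = LatencyMode.AGGRESSIVE_EXCLUSIVE  # the default described above
print(int(mode), needs_fixed_sample_rate(mode))  # 3 True
```

Because IntEnum members compare equal to plain integers, such a value can be used wherever an integer mode is expected.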

PTB Devices

To set the output audio device, use the prefs.hardware['audioDevice'] setting. To list the available devices you can do, for example:

from pprint import pprint
import psychtoolbox.audio
pprint(psychtoolbox.audio.get_devices())

Sound Classes

PTB Sound

class psychopy.sound.backend_ptb.SoundPTB(value='C', secs=0.5, octave=4, stereo=-1, volume=1.0, loops=0, sampleRate=None, blockSize=128, preBuffer=-1, hamming=True, startTime=0, stopTime=-1, name='', autoLog=True, syncToWin=None, speaker=None)

Play a variety of sounds using the new PsychPortAudio library

Parameters:
  • value – note name ("C", "Bfl"), filename or frequency (Hz)

  • secs – duration (for synthesised tones)

  • octave – which octave to use for note names (4 is middle)

  • stereo – -1 (auto), True or False to force sounds to stereo or mono

  • volume – float 0-1

  • loops – number of times to repeat the sound (-1 = forever, 0 = play once with no repeats)

  • sampleRate – sample rate for synthesized tones

  • blockSize – the size of the buffer on the sound card (small for low latency, large for stability)

  • preBuffer – integer to control streaming/buffering: -1 means store all; 0 (no buffer) means stream from disk; potentially we could buffer a few secs(!?)

  • hamming – boolean (default True) to indicate whether the sound should be apodized (i.e., the onset and offset smoothly ramped up from, and down to, zero). The function apodize uses a Hanning window, but arguments named ‘hamming’ are preserved so that existing code is not broken by the change from Hamming to Hanning internally. Not applied to sounds from files.

  • startTime – for sound files this controls the start of snippet

  • stopTime – for sound files this controls the end of snippet

  • name – string for logging purposes

  • autoLog – whether to automatically log every change

  • syncToWin – if you want start/stop to sync with win flips add this
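To illustrate what the secs, sampleRate and hamming parameters mean for a synthesised tone, here is a standalone sketch that builds a sine tone and ramps its onset and offset with half of a Hann (Hanning) window. It is not PsychoPy's apodize implementation: the function synth_tone and the 5 ms ramp length are assumptions chosen for illustration only.

```python
import math


def synth_tone(freq_hz, secs=0.5, sample_rate=48000, ramp_secs=0.005):
    """Return a mono sine tone (floats in -1..1) with raised-cosine ramps."""
    n = int(round(secs * sample_rate))
    ramp_n = min(int(round(ramp_secs * sample_rate)), n // 2)
    samples = []
    for i in range(n):
        value = math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
        if i < ramp_n:
            # onset: gain rises smoothly from 0 towards 1
            gain = 0.5 * (1.0 - math.cos(math.pi * i / ramp_n))
        elif i >= n - ramp_n:
            # offset: mirror image of the onset ramp, ending at 0
            gain = 0.5 * (1.0 - math.cos(math.pi * (n - 1 - i) / ramp_n))
        else:
            gain = 1.0
        samples.append(gain * value)
    return samples


tone = synth_tone(440, secs=0.5, sample_rate=48000)
print(len(tone))  # 24000 samples = 0.5 s at 48 kHz
```

Ramping the edges avoids the audible click that an abrupt start or stop of a sine wave would otherwise produce.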

_EOS(reset=True, log=True)

Function called on End Of Stream

_channelCheck(array)

Checks whether the stream has fewer channels than the data. If so, a ValueError is raised.

_checkPlaybackFinished()

Checks whether playback has finished by looking up the status.

property isFinished

True if the audio playback has completed.

property isPlaying

True if the audio playback is ongoing.

pause(log=True)

Stops the sound without resetting it, so that play will continue from this point if needed.

play(loops=None, when=None, log=True)

Start the sound playing.

Calling this after the sound has finished playing will restart the sound.

setSound(value, secs=0.5, octave=4, hamming=None, log=True)

Set the sound to be played.

Often this is not needed by the user - it is called implicitly during initialisation.

Parameters:
value: can be a number, string or an array:
  • If it’s a number between 37 and 32767 then a tone will be generated at that frequency in Hz.

  • It could be a string for a note (‘A’, ‘Bfl’, ‘B’, ‘C’, ‘Csh’, …). Then you may want to specify which octave.

  • Or a string could represent a filename in the current location, or mediaLocation, or a full path combo

  • Or by giving an Nx2 numpy array of floats (-1:1) you can specify the sound yourself as a waveform

secs: duration (only relevant if the value is a note name or a frequency value)

octave: only relevant if the value is a note name. Middle octave of a piano is 4. Most computers won’t output sounds in the bottom octave (1) and the top octave (8) is generally painful.
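The relationship between a note name, its octave and the resulting frequency can be sketched as follows. This assumes equal temperament with A in octave 4 tuned to 440 Hz and octaves running from C up to B; the table STEPS_FROM_A and the function note_to_freq are illustrative names, not necessarily how the backend computes it:

```python
# Semitone offsets from A within an octave that runs from C up to B.
# Equal temperament and A4 = 440 Hz are assumptions of this sketch.
STEPS_FROM_A = {
    "C": -9, "Csh": -8, "Dfl": -8, "D": -7, "Dsh": -6, "Efl": -6,
    "E": -5, "F": -4, "Fsh": -3, "Gfl": -3, "G": -2, "Gsh": -1,
    "Afl": -1, "A": 0, "Ash": 1, "Bfl": 1, "B": 2,
}


def note_to_freq(note, octave=4):
    """Convert a note name such as 'A' or 'Bfl' to a frequency in Hz."""
    in_octave_4 = 440.0 * 2.0 ** (STEPS_FROM_A[note] / 12.0)
    # each octave up doubles the frequency, each octave down halves it
    return in_octave_4 * 2.0 ** (octave - 4)


print(note_to_freq("A"))            # 440.0
print(round(note_to_freq("C"), 2))  # 261.63
print(note_to_freq("A", octave=5))  # 880.0
```

Under these assumptions the bottom octave sits at only a few tens of Hz, which is consistent with the remark that most computers will not reproduce it.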

stop(reset=True, log=True)

Stop the sound and return to the beginning.

property stream

Read-only property returns the stream on which the sound will be played

property track

The track on the master stream to which we belong

