Interpreting Wav Data

January 03, 2024 Post a Comment

I'm trying to write a program to display PCM data. I've been very frustrated trying to find a library with the right level of abstraction, but I've found the python wave library an

Solution 1:

Thank you for your help! I got it working and I'll post the solution here for everyone to use in case some other poor soul needs it:

import wave
import struct

defpcm_channels(wave_file):
    """Given a file-like object or file path representing a wave file,
    decompose it into its constituent PCM data streams.

    Input: A file like object or file path
    Output: A list of lists of integers representing the PCM coded data stream channels
        and the sample rate of the channels (mixed rate channels not supported)
    """
    stream = wave.open(wave_file,"rb")

    num_channels = stream.getnchannels()
    sample_rate = stream.getframerate()
    sample_width = stream.getsampwidth()
    num_frames = stream.getnframes()

    raw_data = stream.readframes( num_frames ) # Returns byte data
    stream.close()

    total_samples = num_frames * num_channels

    if sample_width == 1: 
        fmt = "%iB" % total_samples # read unsigned charselif sample_width == 2:
        fmt = "%ih" % total_samples # read signed 2 byte shortselse:
        raise ValueError("Only supports 8 and 16 bit audio formats.")

    integer_data = struct.unpack(fmt, raw_data)
    del raw_data # Keep memory tidy (who knows how big it might be)

    channels = [ [] for time inrange(num_channels) ]

    for index, value inenumerate(integer_data):
        bucket = index % num_channels
        channels[bucket].append(value)

    return channels, sample_rate

Solution 2:

"Two channels" means stereo, so it makes no sense to sum each channel's duration -- so you're off by a factor of two (2.18 seconds, not 4.37). As for signedness, as explained for example here, and I quote:

8-bit samples are stored as unsigned bytes, ranging from 0 to 255. 16-bit samples are stored as 2's-complement signed integers, ranging from -32768 to 32767.

This is part of the specs of the WAV format (actually of its superset RIFF) and thus not dependent on what library you're using to deal with a WAV file.

Solution 3:

I know that an answer has already been accepted, but I did some things with audio a while ago and you have to unpack the wave doing something like this.

pcmdata = wave.struct.unpack("%dh"%(wavedatalength),wavedata)

Also, one package that I used was called PyAudio, though I still had to use the wave package with it.

Baca Juga

Solution 4:

Each sample is 16 bits and there 2 channels, so the frame takes 4 bytes

Solution 5:

The duration is simply the number of frames divided by the number of frames per second. From your data this is: 96333 / 44100 = 2.18 seconds.

Python Programming Language