Interpreting Wav Data
Solution 1:
Thank you for your help! I got it working and I'll post the solution here for everyone to use in case some other poor soul needs it:
import wave
import struct
defpcm_channels(wave_file):
"""Given a file-like object or file path representing a wave file,
decompose it into its constituent PCM data streams.
Input: A file like object or file path
Output: A list of lists of integers representing the PCM coded data stream channels
and the sample rate of the channels (mixed rate channels not supported)
"""
stream = wave.open(wave_file,"rb")
num_channels = stream.getnchannels()
sample_rate = stream.getframerate()
sample_width = stream.getsampwidth()
num_frames = stream.getnframes()
raw_data = stream.readframes( num_frames ) # Returns byte data
stream.close()
total_samples = num_frames * num_channels
if sample_width == 1:
fmt = "%iB" % total_samples # read unsigned charselif sample_width == 2:
fmt = "%ih" % total_samples # read signed 2 byte shortselse:
raise ValueError("Only supports 8 and 16 bit audio formats.")
integer_data = struct.unpack(fmt, raw_data)
del raw_data # Keep memory tidy (who knows how big it might be)
channels = [ [] for time inrange(num_channels) ]
for index, value inenumerate(integer_data):
bucket = index % num_channels
channels[bucket].append(value)
return channels, sample_rate
Solution 2:
"Two channels" means stereo, so it makes no sense to sum each channel's duration -- so you're off by a factor of two (2.18 seconds, not 4.37). As for signedness, as explained for example here, and I quote:
8-bit samples are stored as unsigned bytes, ranging from 0 to 255. 16-bit samples are stored as 2's-complement signed integers, ranging from -32768 to 32767.
This is part of the specs of the WAV format (actually of its superset RIFF) and thus not dependent on what library you're using to deal with a WAV file.
Solution 3:
I know that an answer has already been accepted, but I did some things with audio a while ago and you have to unpack the wave doing something like this.
pcmdata = wave.struct.unpack("%dh"%(wavedatalength),wavedata)
Also, one package that I used was called PyAudio, though I still had to use the wave package with it.
Solution 4:
Each sample is 16 bits and there 2 channels, so the frame takes 4 bytes
Solution 5:
The duration is simply the number of frames divided by the number of frames per second. From your data this is: 96333 / 44100 = 2.18 seconds
.
Post a Comment for "Interpreting Wav Data"