Audio File Question

(OP)
I just started writing a program that will graph different features of sound waves. At this point I've only written code to graph the raw data from an uncompressed 16-bit WAV file. The graph looks perfect, exactly how I would expect a sound wave to look, but for some reason the zero point of the waves is above zero.

I had sort of assumed that when a computer records a 16-bit WAV file, zero would represent no pressure change, positive numbers would represent the pressure increasing above normal pressure on the cresting side of the sound wave, and negative numbers would represent the pressure decreasing below normal pressure. So I expected the total of all the numbers (except for the header) in a WAV file to be roughly zero.

So my question is, did I make an incorrect assumption or did I possibly program something wrong?

thank you for your help.

RE: Audio File Question

Hi Mike-

Say, wouldn't it be pretty easy to record a small WAV file
without an input, then run it through your program and
see what pops out on the graph?
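
A minimal sketch of that check, assuming a 16-bit PCM file and using only Python's standard `wave` and `struct` modules. The function name `dc_offset` is made up for illustration. It averages every sample in the file, so a recording of silence should come out close to zero if both the hardware and the reading code are behaving.

```python
import struct
import wave


def dc_offset(path):
    """Return the mean sample value of a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit samples")
        raw = wav.readframes(wav.getnframes())
    count = len(raw) // 2
    # "<h" = little-endian signed 16-bit, the layout of 16-bit PCM WAV data.
    samples = struct.unpack("<%dh" % count, raw[:count * 2])
    return sum(samples) / count if count else 0.0
```

For a stereo file this averages both channels together, which is enough to spot an offset but not to say which channel it comes from.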

At first blush, and I could be absolutely dead wrong, I see no reason why an "AC coupled" source would not be represented as centered on zero or on 1/2 full scale. However, and here's another area where I could be dead wrong, the modulation scheme used might be something like "delta" modulation, where one would expect a "positive" differential of the waveform to be represented by an increasing or "positive" value, while a "negative" differential would be represented by a negative value.

Since it's in "digital" form, there are all kinds of modulation schemes and data compression techniques that might make the data representation unclear. O.K., a quick look via Google says that uncompressed .wav files use PCM, that is, Pulse Code Modulation.

Here's a pretty good little link that I found during the google search that might help explain it.  Search for the data string "pcm" and you should run across it in the section titled:
"Sample Points and Sample Frames"

The link is:

http://www.borg.com/~jglatt/tech/wave.htm

Hope that this helps.

  Cheers,

   Rich S.

RE: Audio File Question

Maybe you picked up on the effects of a biased microphone.

RE: Audio File Question

Hi Mike,

I have been dabbling at writing some programs to analyze WAV files too, so it's gratifying to see a post from someone with the same interest. Here is a quote from a site I used as a reference:

Quote:

8-bit samples are stored as unsigned bytes, ranging from 0 to 255. 16-bit samples are stored as 2's-complement signed integers, ranging from -32768 to 32767

The site URL is http://ccrma.stanford.edu/CCRMA/Courses/422/projects/WaveFormat/

So it would appear your assumption is correct: in 16-bit WAV files the samples are positive and negative. You don't mention whether the files you are graphing are mono or stereo. Could this be a factor? I can't speak authoritatively about different sound cards, but my impression is that most use 10- or 12-bit A/D converters, so the samples are not really 16 bits. The sound card driver would determine whether the samples are sign-extended into 16 bits, so this could have a bearing on the numbers. I found it helpful to use a program to dump the contents of files to the display in hexadecimal so I can see the actual numbers in the DATA chunks of WAV files. Hope this helps.
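
To illustrate why the signed/unsigned distinction in the quote matters, here is a small sketch (the helper names are hypothetical, and this is only one possible cause of a raised zero point, not necessarily what is happening in Mike's program). It decodes the same bytes both ways with Python's `struct` module.

```python
import struct


def decode_signed16(raw):
    """Decode little-endian bytes as signed 16-bit samples (-32768..32767)."""
    return list(struct.unpack("<%dh" % (len(raw) // 2), raw))


def decode_unsigned16(raw):
    """The same bytes misread as unsigned 16-bit values (0..65535)."""
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))


# A small wave swinging around zero: 0, +1000, 0, -1000.
raw = struct.pack("<4h", 0, 1000, 0, -1000)

signed = decode_signed16(raw)
unsigned = decode_unsigned16(raw)

print(signed, sum(signed) / len(signed))        # [0, 1000, 0, -1000] 0.0
print(unsigned, sum(unsigned) / len(unsigned))  # [0, 1000, 0, 64536] 16384.0
```

Read as signed values the samples average to zero, but read as unsigned the negative half of the wave wraps to the top of the range and the average moves well above zero. In a stereo file the samples are interleaved left, right, left, right, so `signed[0::2]` and `signed[1::2]` would separate the two channels.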

Good Luck,
Greg Hansen

RE: Audio File Question

(OP)
It turns out I made a small mistake in how I wrote the program. It was too small to produce an obviously wrong result but enough to knock my zero point off. I appreciate everybody's help on this.

Greg, it is nice to see you're interested in the same thing. What type of application are you writing a sound analyzing program for?

thanks

Mike

RE: Audio File Question

Hi Mike,

Thanks for the reply post, and your interest. Besides enjoying(?) the challenge of Windows programming, I am an amateur musician. I also have an interest in digital signal processing. Combining these interests (programming, music and DSP) has led me to try to develop an application that would help me figure out the notes being played in a given piece of music, using the raw WAV file data. This is not an especially original objective, and the project has been dragging on for a long time. I have read some material on this type of analysis suggesting that the naive approach I am using, which is to simply take a discrete Fourier transform of the sampled sound, is useless. Nonetheless, I figured I would start with that objective and see if it leads anywhere. And it is a useful learning project, with modest enough goals that I can maintain (sort of) my motivation to keep at it.
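
For anyone curious what that naive approach looks like, here is a rough sketch, not my actual program. It runs a plain O(N^2) discrete Fourier transform over one block of samples, takes the strongest bin, and names the nearest equal-tempered note. The function names, the A4 = 440 Hz reference and the synthetic test tone are all assumptions made for the example.

```python
import cmath
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def nearest_note(freq_hz):
    """Name of the equal-tempered note closest to freq_hz (A4 = 440 Hz)."""
    midi = round(69 + 12 * math.log2(freq_hz / 440.0))
    return "%s%d" % (NOTE_NAMES[midi % 12], midi // 12 - 1)


def dominant_frequency(samples, sample_rate):
    """Frequency of the strongest bin of a naive O(N^2) DFT, DC excluded."""
    n = len(samples)
    best_bin, best_mag = 1, 0.0
    for k in range(1, n // 2):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * sample_rate / n


# Synthetic test tone: 0.05 s of a 440 Hz sine sampled at 8000 Hz.
rate = 8000
tone = [math.sin(2 * math.pi * 440 * i / rate) for i in range(400)]
peak = dominant_frequency(tone, rate)
print(peak, nearest_note(peak))
```

A single clean sine is the easy case. With real instruments the strongest bin is often a harmonic rather than the fundamental, which is one reason this approach is considered naive.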

If you don't mind me asking, what is your application?

Good Luck,
Greg Hansen

RE: Audio File Question

To MikeMM and Gregha04756,
here is another amateur musician who has been trying to convert a WAV file into sheet music for a long time. I abandoned Fourier transforms too, because a simple simulation showed me that they would not work for polyphonic tracks. At the moment I do not have any idea what to do next.
m777182

RE: Audio File Question

Hi m777182,

I'm afraid I don't have much insight to share that would point you in the right direction. My main thought was that I might be able to decode some blindingly fast Eddie Van Halen solo or such, based on the idea that the loudest instrument masks the others and would contain most of the signal power. From what I have read, this is one of the principles employed in audio file compression. I thought it might work in this application too, but perhaps not.

The site I referenced in my previous post, from the Stanford U. Center for Computer Research in Music and Acoustics, seems to be a fairly comprehensive resource on the subject. Have you looked there?

Good Luck,
Greg Hansen

RE: Audio File Question

You're much better off finding someone with perfect pitch AND who can translate what they hear into sheet music.  The human brain and ear are simply much better at this than any algorithm you can come up with.

TTFN



RE: Audio File Question

(OP)
Hi Greg and m777182,

I'm writing a sound analyzing program because I have some ideas about how to write some artificial intelligence programs and I want to see if I can write a speech recognition program that actually works well. But before I do that I want to graph certain features of sound files to make certain I'm using the best approach for this. I know there are spectrogram programs out there, but there are certain things I want to look at that they aren't good at.

m777182 I wish I had a good suggestion on how to recognize musical notes but I haven't even looked at what makes a sound sound on-key or off-key or any of that stuff. I've only looked at human speech. The only thing I would suggest is to download a spectrogram program if you haven't already and try to find patterns in musical notes that your program can analyze. Also if you are trying to analyze singing it might help to look at these human speech pages:

http://speech.bme.ogi.edu/tutordemos/SpectrogramReading/spectrogram_reading.html
http://speech.bme.ogi.edu/tutordemos/SpectrogramReading/ipa/ipahome.html


It might be a while but when I'm done with the graphing program I could give you guys a copy. I don't think I'm going to try to sell it so be warned it probably won't look like a nice polished program.


good luck,
Mike

RE: Audio File Question

This is a notice to Gregha04756 and MikeMM:
I have made a small step in my endeavours to transcribe wave files into sheet music. The clue is perhaps the wavelet transformation of my wave file. I do not need to look at ALL frequencies, because we normally do not hear beyond 16 kHz, and the shortest time interval is about one quarter or perhaps one eighth of the measure. So time slices one quarter long in the time domain would be the samples that are statistically stationary, and over this time range you perform FFTs. Again, you are not interested in ALL frequencies but only in those that are near the "in tune" frequencies. So we look for spectral lines that correspond to elements of the tonal system. I am making further experiments.
regards
m777182

RE: Audio File Question

I do not see how wavelets (at least in themselves) are going to help solve your problem. From a pure frequency/time standpoint, wavelets and FFTs are similar in that they tell you what frequencies exist in the material and at what time. The difference is in the way they accomplish this goal and in their accuracies. FFTs provide a constant resolution at all times/frequencies. Wavelets provide good time/poor frequency resolution at high frequencies, and the opposite at low frequencies.
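
To put rough numbers on the FFT side of that trade-off, here is a small worked example. It assumes a tempo of 120 beats per minute and equal temperament, neither of which comes from the thread, and it ignores windowing effects. The bin spacing of a transform over a slice of T seconds is 1/T Hz regardless of frequency, while the gap between neighbouring semitones grows with pitch.

```python
def bin_spacing_hz(slice_seconds):
    """Spacing between DFT bins for a slice of the given duration."""
    return 1.0 / slice_seconds


def semitone_gap_hz(freq_hz):
    """Distance from freq_hz up to the next equal-tempered semitone."""
    return freq_hz * (2 ** (1 / 12) - 1)


quarter = 60.0 / 120  # seconds per quarter note at an assumed 120 BPM
for label, seconds in [("quarter", quarter), ("eighth", quarter / 2)]:
    print(label, "note slice: bins every", bin_spacing_hz(seconds), "Hz")

for name, freq in [("C2", 65.41), ("A4", 440.0)]:
    print(name, "semitone gap: %.2f Hz" % semitone_gap_hz(freq))
```

Under these assumptions a quarter-note slice gives bins 2 Hz apart and an eighth-note slice gives bins 4 Hz apart, while the semitone above C2 is only about 3.89 Hz away. Short slices therefore cannot separate neighbouring low notes, even though they separate notes around A4 (about 26 Hz apart) easily.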

Dan - Owner
http://www.Hi-TecDesigns.com

RE: Audio File Question

Thanks, Macgyvwers2000, for your comment. The point is that, as you said, FFTs provide a constant resolution over the entire time interval and for all frequencies, but I am not interested in all frequencies, only in those that correspond to particular tones of the tonal ladder. Secondly, I am looking for dominant frequencies in a small time interval that is equal to the shortest element of a measure (or bar) I would like to discriminate. Here is the place open to discussion: is it one quarter, or one eighth, or maybe one twelfth (in blues, e.g.)?

I think this is the way to extract the four voices of a chorus in a time slice that corresponds to the smallest time interval that interests me. If I succeed in extracting the 4 most dominant spectral lines in the first time slice (let it be one quarter), I can write down the names of particular tones instead of frequencies, like c1, e1, g1 and c2. If in the next time slice the spectral lines are c1, e1, a1 and e2, then I conclude that in the first half of the bar there was one half on c1, one half on e1, a quarter on g1 that moved to a1, and a quarter on c2 that moved to e2.

You would argue that two consecutive spectral pictures do not tell me that voice1 stayed on c1 and that it was not voice2 that moved from e1 to c1 in the second time slice. I do not know yet how to manage that; my ear and music experience will have to intervene at this stage. But I do not exclude the possibility that, as MikeMM pointed out, some AI program will recognise the specific colour of a particular voice, maybe through its higher harmonics. In that case a step of voice1 would be detected from a simultaneous step of all its specific harmonics.

But this will take me some time. After all, I am a specialist neither in programming nor in signal processing. These activities of mine are intended to support my hobby efforts to make good sheet music.
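
A minimal sketch of the slice-by-slice idea, under several assumptions that are not from my experiments: equal temperament with A4 = 440 Hz, c1/e1/g1/c2 written as C4/E4/G4/C5, an arbitrary search range, and a synthetic chord of equal-strength sines. Instead of a full FFT it evaluates a single-frequency DFT only at the in-tune note frequencies and keeps the 4 strongest. All names in the code are made up for illustration.

```python
import cmath
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def note_frequency(midi):
    """Equal-tempered frequency of a MIDI note number (69 = A4 = 440 Hz)."""
    return 440.0 * 2 ** ((midi - 69) / 12)


def note_name(midi):
    return "%s%d" % (NOTE_NAMES[midi % 12], midi // 12 - 1)


def note_energy(time_slice, sample_rate, freq_hz):
    """Magnitude of the slice's correlation with a complex tone at freq_hz."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * i / sample_rate)
              for i, s in enumerate(time_slice))
    return abs(acc)


def strongest_notes(time_slice, sample_rate, count=4, low=48, high=84):
    """Names of the `count` strongest in-tune notes, ordered by pitch."""
    scored = sorted(
        ((note_energy(time_slice, sample_rate, note_frequency(m)), m)
         for m in range(low, high + 1)),
        reverse=True)
    return [note_name(m) for m in sorted(m for _, m in scored[:count])]


# Synthetic quarter-second slice of a C major chord: C4, E4, G4, C5.
rate = 8000
chord = [60, 64, 67, 72]
time_slice = [sum(math.sin(2 * math.pi * note_frequency(m) * i / rate)
                  for m in chord)
              for i in range(rate // 4)]
print(strongest_notes(time_slice, rate))
```

This only covers the easy part. Real voices have harmonics that can outweigh another voice's fundamental, and nothing here decides which voice moved between two slices.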
m777182

RE: Audio File Question

The wavelet transform could possibly be of use to you. The kernel of the wavelet transform is a generic form, and changing it changes the type of wavelet you form (Haar, Morlet, Meyer, Shannon, etc.). There has been work on forming kernels that produce results similar to 1/3- or 1/12-octave filters and so forth.
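
As a rough illustration of what such a kernel can look like, here is a sketch of a Morlet-style kernel: a complex tone under a Gaussian window whose length is a fixed number of cycles, so the bandwidth is a constant fraction of the centre frequency. The function name and the `cycles` value are assumptions chosen for the example, and this is not tuned to be a true 1/12-octave filter.

```python
import cmath
import math


def morlet_response(samples, sample_rate, freq_hz, cycles=40):
    """Magnitude of the signal's inner product with a Morlet-style kernel
    centred on the middle sample. The Gaussian window narrows in time as
    freq_hz rises, so each note gets the same relative bandwidth."""
    sigma = cycles / (2 * math.pi * freq_hz)   # Gaussian width in seconds
    half = int(3 * sigma * sample_rate)        # truncate at +/- 3 sigma
    centre = len(samples) // 2
    acc = 0j
    for offset in range(-half, half + 1):
        idx = centre + offset
        if 0 <= idx < len(samples):
            t = offset / sample_rate
            window = math.exp(-t * t / (2 * sigma * sigma))
            acc += samples[idx] * window * cmath.exp(-2j * math.pi * freq_hz * t)
    return abs(acc)


# One second of a 440 Hz sine at 8000 Hz, probed at A4 and one semitone up.
rate = 8000
tone = [math.sin(2 * math.pi * 440 * i / rate) for i in range(rate)]
on_note = morlet_response(tone, rate, 440.0)
neighbour = morlet_response(tone, rate, 440.0 * 2 ** (1 / 12))
print(on_note > neighbour)
```

Raising `cycles` narrows the band around each note at the cost of a longer kernel, which is the same time/frequency trade-off discussed above.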

So finding the right wavelet to apply might help in finding the particular tones you're interested in.
