Tutorial: The step two to making a ‘Talking’ iPhone app, when to record and when to stop recording

This post is a follow-up to the first ‘Talking’ app tutorial (further down this page).

The ‘Talking’ app: you say something and an animal repeats what you say in a cute voice.

Well, we can’t really ask the player to tap the animal to make it record; we want the animal to simply start recording when the player says something, stop recording when the player stops talking, and then play it back. So how do we detect that the player has stopped talking?

How do we start recording when sound is detected, and stop recording when silence is detected?

From Stack Overflow:

Perhaps you could use the AVAudioRecorder’s support for audio level metering to keep track of the audio levels and enable recording when the levels are above a given threshold. You’d need to enable metering with:

[anAVAudioRecorder setMeteringEnabled:YES];

and then you could periodically call:

[anAVAudioRecorder updateMeters];
float power = [anAVAudioRecorder averagePowerForChannel:0];
if (power > threshold && anAVAudioRecorder.recording == NO) {
    [anAVAudioRecorder record];
} else if (power < threshold && anAVAudioRecorder.recording == YES) {
    [anAVAudioRecorder stop];
}

Or something like that.

Source: http://stackoverflow.com/questions/3855919/ios-avaudiorecorder-how-to-record-only-when-audio-input-present-non-silence
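
The answer leaves “periodically” open. One simple way to do it, purely my assumption (checkLevels: is a hypothetical method that would wrap the snippet above):

// Poll the meter ten times a second; checkLevels: would contain the
// updateMeters / averagePowerForChannel check shown above.
[NSTimer scheduledTimerWithTimeInterval:0.1
                                 target:self
                               selector:@selector(checkLevels:)
                               userInfo:nil
                                repeats:YES];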

According to the API docs, averagePowerForChannel: returns the average power, in decibels, of the sound being recorded. A return value of 0 means the recording is at full scale, the maximum power (like when someone shouts really, really loudly into the mic?), while -160 is the minimum power, or near silence (which is what we want to detect, right, near silence?).
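
To get a feel for that scale, here is the conversion from decibels to a linear 0-1 value; it is the same pow(10, 0.05 * dB) formula Dan’s algorithm below uses (variable names here are mine):

[recorder updateMeters];
// averagePowerForChannel: returns decibels: 0 dB is full scale, -160 dB is silence
float db = [recorder averagePowerForChannel:0];
// invert dB = 20 * log10(x): pow(10, 0.05 * 0) = 1.0, pow(10, 0.05 * -160) ≈ 1e-8
double linear = pow(10, 0.05 * db);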

Another tutorial (Tutorial: Detecting When a User Blows into the Mic by Dan Grigsby) shows you can also use peakPowerForChannel:. He made an algorithm that low-pass filters the audio input into a lowPassResults value:

From the tutorial:

Each time the timer’s callback method is triggered the lowPassResults level variable is recalculated. As a convenience, it’s converted to a 0-1 scale, where zero is complete quiet and one is full volume.

We’ll recognize someone as having blown into the mic when the low pass filtered level crosses a threshold. Choosing the threshold number is somewhat of an art. Set it too low and it’s easily triggered; set it too high and the person has to breathe into the mic at gale force and at length. For my app’s need, 0.95 works.

- (void)listenForBlow:(NSTimer *)timer {
    [recorder updateMeters];

    // lowPassResults is presumably an instance variable in Dan's class,
    // since the filtered value has to persist between timer callbacks
    const double ALPHA = 0.05;
    double peakPowerForChannel = pow(10, (0.05 * [recorder peakPowerForChannel:0]));
    lowPassResults = ALPHA * peakPowerForChannel + (1.0 - ALPHA) * lowPassResults;

    if (lowPassResults > 0.95)
        NSLog(@"Mic blow detected");
}

Source: http://mobileorchard.com/tutorial-detecting-when-a-user-blows-into-the-mic/

So I am using Dan’s algorithm, except for the threshold number; I’m still testing that out, and it really is somewhat of an art.

Okay, now we know when the player STOPS talking, but what about when the player STARTS talking? We wouldn’t know that, since we stopped recording when the player stopped talking, right? We can’t get the power for a channel from a stopped recorder.

And Stack Overflow comes to the rescue again. I read somewhere that you should have TWO AVAudioRecorders instead of ONE: one to monitor the power for the channel at all times, and one to actually record the player’s voice.

So we have:

NSURL *monitorTmpFile;
NSURL *recordedTmpFile;
AVAudioRecorder *recorder;
AVAudioRecorder *audioMonitor;

And some booleans to keep track of when it is recording or playing:

BOOL isRecording;
BOOL isPlaying;
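
The snippets below also use a few constants and ivars that are declared elsewhere; here is a sketch of how they might look (the numeric values are placeholder guesses on my part, tune them for your app):

// Tuning constants; the values here are illustrative, not tested numbers.
#define AUDIOMONITOR_THRESHOLD 0.1 // 0-1 level above which we treat input as sound
#define MAX_SILENCETIME 1.0        // seconds of silence before we stop recording
#define MAX_MONITORTIME 30.0       // restart the monitor so its file stays small
#define MIN_RECORDTIME 0.5         // throw away recordings shorter than this

double audioMonitorResults;        // low-pass filtered mic level
ccTime silenceTime;                // how long the current silence has lasted
NSError *error;                    // out-param reused by the AVAudioRecorder calls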

We have to initialize both recorders; somewhere in your init, add:

[self initAudioMonitor];
[self initRecorder];

The functions:

-(void) initAudioMonitor
{   NSMutableDictionary *recordSetting = [[NSMutableDictionary alloc] init];
    [recordSetting setValue:[NSNumber numberWithInt:kAudioFormatAppleIMA4] forKey:AVFormatIDKey];
    [recordSetting setValue:[NSNumber numberWithFloat:44100.0] forKey:AVSampleRateKey];
    [recordSetting setValue:[NSNumber numberWithInt:1] forKey:AVNumberOfChannelsKey];

    NSArray *documentPaths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
    NSString *fullFilePath = [[documentPaths objectAtIndex:0] stringByAppendingPathComponent:@"monitor.caf"];
    monitorTmpFile = [NSURL fileURLWithPath:fullFilePath];

    audioMonitor = [[AVAudioRecorder alloc] initWithURL:monitorTmpFile settings:recordSetting error:&error];

    [audioMonitor setMeteringEnabled:YES];
    [audioMonitor setDelegate:self];

    // the monitor starts recording immediately; we only ever read its meters
    [audioMonitor record];
}

-(void) initRecorder
{   NSMutableDictionary *recordSetting = [[NSMutableDictionary alloc] init];
    [recordSetting setValue:[NSNumber numberWithInt:kAudioFormatAppleIMA4] forKey:AVFormatIDKey];
    [recordSetting setValue:[NSNumber numberWithFloat:44100.0] forKey:AVSampleRateKey];
    [recordSetting setValue:[NSNumber numberWithInt:1] forKey:AVNumberOfChannelsKey];

    NSArray *documentPaths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
    NSString *fullFilePath = [[documentPaths objectAtIndex:0] stringByAppendingPathComponent:@"in.caf"];
    recordedTmpFile = [NSURL fileURLWithPath:fullFilePath];

    recorder = [[AVAudioRecorder alloc] initWithURL:recordedTmpFile settings:recordSetting error:&error];

    [recorder setMeteringEnabled:YES];
    [recorder setDelegate:self];

    // unlike the monitor, the recorder only prepares here; recording starts later
    [recorder prepareToRecord];
}

And then we have a function that will be called all the time to monitor our AVAudioRecorders; call it from your update loop, or schedule it as shown after the code:

-(void) monitorAudioController: (ccTime) dt
{
    if(!isPlaying)
    {   [audioMonitor updateMeters];

        // as a convenience, the level is converted to a 0-1 scale,
        // where zero is complete quiet and one is full volume
        const double ALPHA = 0.05;
        double peakPowerForChannel = pow(10, (0.05 * [audioMonitor peakPowerForChannel:0]));
        // audioMonitorResults is an ivar, so the low-pass filter keeps its state between calls
        audioMonitorResults = ALPHA * peakPowerForChannel + (1.0 - ALPHA) * audioMonitorResults;

        NSLog(@"audioMonitorResults: %f", audioMonitorResults);

        if (audioMonitorResults > AUDIOMONITOR_THRESHOLD)
        {   NSLog(@"Sound detected");

            if(!isRecording)
            {   [audioMonitor stop];
                [self startRecording];
            }
        }   else
        {   NSLog(@"Silence detected");
            if(isRecording)
            {   if(silenceTime > MAX_SILENCETIME)
                {   NSLog(@"Next silence detected");
                    [audioMonitor stop];
                    [self stopRecordingAndPlay];
                    silenceTime = 0;
                }   else
                {   silenceTime += dt;
                }
            }
        }

        // restart the monitor periodically so its file doesn't grow forever
        if([audioMonitor currentTime] > MAX_MONITORTIME)
        {   [audioMonitor stop];
            [audioMonitor record];
        }
    }
}
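
In my case this is driven by Cocos2D’s scheduler. A minimal sketch, assuming the method lives in a CCLayer (the 0.05 s interval is my guess, not a tested value):

// Somewhere in the layer's init: run the monitor roughly every 0.05 s.
// Cocos2D passes the elapsed time since the last call as the ccTime argument.
[self schedule:@selector(monitorAudioController:) interval:0.05f];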

Okay, lemme explain…

You have to call [audioMonitor updateMeters] because, according to the AVAudioRecorder class reference, it:

Refreshes the average and peak power values for all channels of an audio recorder.

And then, do you see Dan’s algorithm?

const double ALPHA = 0.05;
double peakPowerForChannel = pow(10, (0.05 * [audioMonitor peakPowerForChannel:0]));
audioMonitorResults = ALPHA * peakPowerForChannel + (1.0 - ALPHA) * audioMonitorResults;

NSLog(@"audioMonitorResults: %f", audioMonitorResults);

If audioMonitorResults is greater than our threshold AUDIOMONITOR_THRESHOLD (getting this value right requires many hours of testing and monitoring, which is why I have an NSLog in there), that means we have detected sound. And we start recording!

if(!isRecording)
{   [audioMonitor stop];
    [self startRecording];
}

If it isn’t already recording, we stop the audio monitor and start recording:

-(void) startRecording
{   NSLog(@"startRecording");

    isRecording = YES;
    [recorder record];
}

Okay then, if audioMonitorResults is less than AUDIOMONITOR_THRESHOLD and we are recording, it means silence has been detected. But, but, but, we do not stop the recording at once. Why? Because people speak like this: “Hello, how are you?” instead of “Hellohowareyou”. The pauses between words are also detected as silences, which is why:

if(isRecording)
{   if(silenceTime > MAX_SILENCETIME)
    {   NSLog(@"Next silence detected");
        [audioMonitor stop];
        [self stopRecordingAndPlay];
        silenceTime = 0;
    }   else
    {   silenceTime += dt;
    }
}

MAX_SILENCETIME is the threshold for the silence time between words: pauses shorter than that count as part of the same sentence, and only a longer silence stops the recording.

And then, to make sure the size of our audioMonitor’s output file will not explode:

if([audioMonitor currentTime] > MAX_MONITORTIME)
{   [audioMonitor stop];
    [audioMonitor record];
}

Stopping and immediately re-recording saves the monitor’s file and starts a fresh one every MAX_MONITORTIME seconds.

And then stopRecordingAndPlay:

-(void) stopRecordingAndPlay
{   NSLog(@"stopRecording Record time: %f", [recorder currentTime]);

    if([recorder currentTime] > MIN_RECORDTIME)
    {   isRecording = NO;
        [recorder stop];

        isPlaying = YES;
        // insert code for playing the audio here (see the sketch below)
    }   else
    {   [audioMonitor record];
    }
}
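
One way to fill in that playback gap is to borrow the CDSoundEngine setup from the chipmunkifying tutorial below (soundEngine and recordedTmpFileIdx come from that post; treat this as a sketch, not the exact code I shipped):

// Load the freshly recorded file into the sound engine and play it
// back at double pitch for the chipmunk effect.
[soundEngine loadBuffer:recordedTmpFileIdx filePath:[recordedTmpFile path]];
[soundEngine playSound:recordedTmpFileIdx sourceGroupId:0 pitch:2.0f pan:0.0f gain:1.0f loop:NO];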

After the audio is played, call:

-(void) stopPlaying
{   isPlaying = NO;
    [audioMonitor record];
}
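
CDSoundEngine won’t call you back when the sound finishes, so you need some way of knowing when to call stopPlaying. A rough sketch of one option (my assumption, not the post’s code: capture the length before stopping the recorder, and since pitch 2.0f plays back at double speed, the clip lasts about half its recorded duration):

// Hypothetical: grab the length before [recorder stop], then schedule stopPlaying.
NSTimeInterval recordedDuration = [recorder currentTime];
[self performSelector:@selector(stopPlaying)
           withObject:nil
           afterDelay:(recordedDuration / 2.0) + 0.1];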

And there we go! :)

To summarize:

  • 1 AVAudioRecorder to monitor when the player starts talking and stops talking
  • 1 AVAudioRecorder to record the player’s voice
  • Use peakPowerForChannel to detect talking or silence

And that’s about it!

Filed under app avaudiorecorder check metering detect silence detect sound openal stack overflow talking ios iphone sdk


Tutorial: The first step to making a ‘Talking’ iPhone app, chipmunkifying your voice!

Yah, I invented a word, “chipmunkifying”: making your voice sound like a chipmunk. Why? ‘Talking’ apps are all over the iPhone App Store; there’s a talking tom cat, a talking bird, a talking hippo, a talking giraffe, whatever animal you can think of… Basically what those apps do is: you say something, and then the animal repeats it in this chipmunk-like voice. Oh, and you can poke, tickle, and hit it too, or whatever…

Oh, I am using Cocos2D as my game engine. So, begin by creating a project using the Cocos2D application template. If you are not familiar with Cocos2D, you can go to its website for instructions on how to download and install it.

Well, so the first step to a ‘Talking’ app is of course, it has to record what you say.

I’ll be using AVAudioRecorder to record my voice; it’s really simple to set up, just follow the instructions in this blog post by Jake Wyman. But he uses the plain iPhone SDK, while I will be using Cocos2D, so just follow his tutorial up to the part about adding frameworks:

From your Xcode interface you are going to select the Frameworks folder, ctrl-click and choose Add > Existing Frameworks… Then choose both CoreAudio.framework and AVFoundation.framework.

And after we have done that, some coding… or rather, copy-pasting some code from Jake’s tutorial.

First, create an NSObject subclass named AudioController.

Import AVFoundation and CoreAudio:

#import <AVFoundation/AVFoundation.h>
#import <CoreAudio/CoreAudioTypes.h>

Set up AudioController as an AVAudioRecorderDelegate, and declare an AVAudioRecorder and an NSURL for recordedTmpFile (the file where we will temporarily store our audio):

@interface AudioController : NSObject <AVAudioRecorderDelegate>
{   AVAudioRecorder *recorder;
    NSURL *recordedTmpFile;   // an NSURL, since the record function assigns one
    NSError *error;           // reused by the AVAudioSession/AVAudioRecorder calls
}

We instantiate an AVAudioSession in a function called initAudioController (basically the code inside Jake’s viewDidLoad):

- (void) initAudioController
{   //Instantiate an instance of the AVAudioSession object.
    AVAudioSession *audioSession = [AVAudioSession sharedInstance];
    //Set up the audioSession for playback and record.
    //We could just use record and then switch it to playback later, but
    //since we are going to do both, let's set it up once.
    [audioSession setCategory:AVAudioSessionCategoryPlayAndRecord error:&error];
    //Activate the session
    [audioSession setActive:YES error:&error];
}

And then our record function:

-(void) record
{   NSMutableDictionary *recordSetting = [[NSMutableDictionary alloc] init];
    [recordSetting setValue:[NSNumber numberWithInt:kAudioFormatAppleIMA4] forKey:AVFormatIDKey];
    [recordSetting setValue:[NSNumber numberWithFloat:44100.0] forKey:AVSampleRateKey];
    [recordSetting setValue:[NSNumber numberWithInt:2] forKey:AVNumberOfChannelsKey];

    recordedTmpFile = [NSURL fileURLWithPath:[NSTemporaryDirectory() stringByAppendingPathComponent:@"recording.caf"]];

    recorder = [[AVAudioRecorder alloc] initWithURL:recordedTmpFile settings:recordSetting error:&error];

    [recorder setDelegate:self];
    [recorder prepareToRecord];
    [recorder record];
}

And now, to play whatever was recorded in a chipmunk voice. In Jake’s project he uses AVAudioPlayer to play his sound, but that isn’t going to work for me, because AVAudioPlayer doesn’t let me change the playback speed.

So instead of using that, I will be using CocosDenshion’s CDSoundEngine. I am reading Chapter 9: Playing Sounds With CocosDenshion of the Cocos2d for iPhone 0.99 Beginner’s Guide:

According to Pablo Ruiz, we need to import more frameworks to get CDSoundEngine working:

… include OpenAL and AudioToolbox frameworks in your project.

More imports:

#import "cocos2d.h"
#import "CocosDenshion.h"

And then declare a CDSoundEngine, plus an int to use as the buffer id for our recorded file:

CDSoundEngine *soundEngine;
int recordedTmpFileIdx;   // buffer id passed to loadBuffer/playSound below; 0 is fine

In the initAudioController function, we initialize the soundEngine:

soundEngine = [[CDSoundEngine alloc] init: kAudioSessionCategory_PlayAndRecord];

NSArray *defs = [NSArray arrayWithObjects: [NSNumber numberWithInt:1],nil];
[soundEngine defineSourceGroups:defs];

And then we play:

-(void) play
{   NSString *filePath = [NSTemporaryDirectory() stringByAppendingPathComponent:@"recording.caf"];

    [soundEngine loadBuffer:recordedTmpFileIdx filePath:filePath];
    [soundEngine playSound:recordedTmpFileIdx sourceGroupId:0 pitch:2.0f pan:0.0f gain:1.0f loop:NO];
}

Take note of the pitch parameter: it says 2.0f. What does that mean? Normal pitch is 1.0f. If you increase the value, you get a higher pitch, also known as a chipmunked voice; if you decrease it, you get a low, kind of creepy voice.
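
For example, the same call with the pitch dropped instead (0.5f is just an arbitrary value to try):

// Half pitch: the same recording comes out as a slow, creepy voice.
[soundEngine playSound:recordedTmpFileIdx sourceGroupId:0 pitch:0.5f pan:0.0f gain:1.0f loop:NO];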

We also need a function that stops the recording and then starts playing:

-(void) stopRecording
{   [recorder stop];
    [self play];
}

And then we make a function for unloading the AudioController:

- (void) unloadAudioController
{   NSFileManager *fm = [NSFileManager defaultManager];
    [fm removeItemAtPath:[recordedTmpFile path] error:&error];

    // release the recorder (never call -dealloc directly) and the sound engine
    [recorder release];
    recorder = nil;
    [soundEngine release];
}

Okay, now that we have AudioController done, it’s time to call it in our HelloWorld scene. Yes, the HelloWorld that comes with the default template. I also added a BOOL isRecording to keep track of whether we are recording or playing.

HelloWorld.h:

#import "cocos2d.h"
#import "AudioController.h"

// HelloWorld Layer
@interface HelloWorld : CCLayer
{    AudioController *audioLayer;

     BOOL isRecording;   // a plain BOOL, not a pointer
}

+(id) scene;

@end

For HelloWorld.m, in init, register the layer for targeted touches (with swallowsTouches:YES) and change the “Hello World” label to “Say something…” or “Speak!” or “Talk to me” or whatever.

[[CCTouchDispatcher sharedDispatcher] addTargetedDelegate:self priority:0 swallowsTouches:YES];

CCLabel* label = [CCLabel labelWithString:@"Say something…" fontName:@"Marker Felt" fontSize:32];

Also in init, initialize audioLayer and set isRecording to NO:

audioLayer = [[AudioController alloc] init];
[audioLayer initAudioController];

isRecording = NO;

And then, since I am too lazy to add buttons, the user simply taps anywhere on the iPhone once to record, and then taps again to stop recording and play the audio.

- (BOOL)ccTouchBegan:(UITouch *)touch withEvent:(UIEvent *)event
{    if(isRecording)
     {    [audioLayer stopRecording];
          isRecording = NO;
     }
     else
     {    [audioLayer record];
          isRecording = YES;
     }

     // we swallow the touch, so return YES
     return YES;
}

And in HelloWorld’s dealloc add:

[audioLayer unloadAudioController];

And that’s it :)

You can record your voice and play to sound like a chipmunk :)

For any questions (or if you find any errors), feel free to contact me through Twitter, here, email, Facebook or whatever.

Filed under cocos2d avaudiorecorder cocosdenshion iphone app development talking chipmunk voice record audio play