The `TextToSpeechService` will then use the body to make a call to Azure Speech Service, which will return an MP3. You can find the finished code on the completed branch.

The project contains two controllers: the `ValuesController` and the one we shall be editing, the `SpeechController`. It also contains an `appsettings.json` file and the models needed for the requests. The `ValuesController`, which I leave in for debugging purposes, means we should see `["value1","value2"]` displayed in the browser.

Add the following to your `secrets.json` file, inserting your own values in the relevant places:

```json
"TwilioAccount": {
    "AccountSid": "ACCOUNT_SID",
    "AuthToken": "AUTH_TOKEN"
},
"CsAccount": {
    "SubscriptionKey": "COGNITIVE_SERVICES_API_KEY"
},
"StorageCreds": {
    "Account": "STORAGE_ACCOUNT_NAME",
    "Key": "STORAGE_ACCOUNT_API_KEY"
}
```
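If you prefer the command line, the same values can be set with the Secret Manager tool. This is a sketch of the equivalent commands, assuming you run them from the project directory and the project file already has a `UserSecretsId` (keys use `:` to denote nesting):

```
# Set each secret; these mirror the JSON structure above
dotnet user-secrets set "TwilioAccount:AccountSid" "ACCOUNT_SID"
dotnet user-secrets set "TwilioAccount:AuthToken" "AUTH_TOKEN"
dotnet user-secrets set "CsAccount:SubscriptionKey" "COGNITIVE_SERVICES_API_KEY"
dotnet user-secrets set "StorageCreds:Account" "STORAGE_ACCOUNT_NAME"
dotnet user-secrets set "StorageCreds:Key" "STORAGE_ACCOUNT_API_KEY"
```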
You could add these values to the `appsettings.json` file instead, but remember the `appsettings.json` file gets checked in to source control, so leave the values blank or use a reminder such as "value set in user secrets".

In the `Startup.cs` file, you will see where I have mapped the app settings to our values:

```csharp
...
services.Configure<TwilioAccount>(Configuration.GetSection("TwilioAccount"));
services.Configure<CsAccount>(Configuration.GetSection("CsAccount"));
services.Configure<StorageCreds>(Configuration.GetSection("StorageCreds"));
...
```
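Once mapped, a class can receive these settings through constructor injection. Here is a minimal sketch, assuming `TwilioAccount` is a plain class with `AccountSid` and `AuthToken` properties matching the settings above (`ExampleService` is a hypothetical name for illustration):

```csharp
// Hedged sketch: consuming a mapped configuration section via the Options Pattern.
using Microsoft.Extensions.Options;

public class ExampleService
{
    private readonly TwilioAccount _twilioAccount;

    // IOptions<T> is resolved by the DI container and exposes the bound section.
    public ExampleService(IOptions<TwilioAccount> twilioOptions)
    {
        _twilioAccount = twilioOptions.Value;
    }
}
```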
These settings are then consumed using `IOptions` and the Options Pattern.

Open `AuthenticationService.cs` in the `Services` folder. Add the following code, where we create a new HTTP request with the Token API URI and our Azure Speech Service subscription key to receive our auth token:

```csharp
...
public async Task<string> FetchTokenAsync()
{
    using (var client = new HttpClient())
    {
        client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", _subscriptionKey);
        var uriBuilder = new UriBuilder(FetchTokenUri);
        var result = await client.PostAsync(uriBuilder.Uri.AbsoluteUri, null);
        return await result.Content.ReadAsStringAsync();
    }
}
...
```
Next, open the `TextToSpeechService.cs` file and add the following code:

```csharp
public async Task<HttpSpeechResponse> GetSpeech(string body, string from)
{
    var response = new HttpSpeechResponse();

    // Below is the endpoint I was given when I added Speech Services;
    // you can substitute it for the one you get.
    var endpoint = "https://westus.tts.speech.microsoft.com/cognitiveservices/v1";
    var token = await _authenticationService.FetchTokenAsync();

    using (var client = new HttpClient())
    {
        client.DefaultRequestHeaders.Add("X-Microsoft-OutputFormat", "audio-16khz-128kbitrate-mono-mp3");
        client.DefaultRequestHeaders.Add("User-Agent", "autotexter");
        client.DefaultRequestHeaders.Add("Authorization", token);

        var uriBuilder = new UriBuilder(endpoint);

        var text = $@"
            <speak version='1.0' xmlns=""http://www.w3.org/2001/10/synthesis"" xml:lang='en-US'>
                <voice name='Microsoft Server Speech Text to Speech Voice (en-GB, Susan, Apollo)'>
                    You had a text message from {from} <break time=""100ms"" />
                    The message was <break time=""100ms"" /> {body}
                </voice>
            </speak>";

        var content = new StringContent(text, Encoding.UTF8, "application/ssml+xml");
        var result = await client
            .PostAsync(uriBuilder.Uri.AbsoluteUri, content)
            .ConfigureAwait(false);

        response.Code = result.StatusCode;
        if (result.IsSuccessStatusCode)
        {
            // add code to save the soundbite here
        }
        return response;
    }
}
```
In the code above, I created a variable named `text` and assigned it a Speech Synthesis Markup Language, or SSML, string, then passed in the from number and the message body text. You can have fun and play around with things like speed and pronunciation, or even change the voice and accent of the speaker.

In the `Startup.cs` file, you will see our two new services have already been added, ready for .NET Core's built-in dependency injection to pick up:

```csharp
...
services.AddScoped<IAuthenticationService, AuthenticationService>();
services.AddScoped<ITextToSpeechService, TextToSpeechService>();
services.AddMvc().SetCompatibilityVersion(CompatibilityVersion.Version_2_1);
...
```
I registered these as `Scoped`, as I want the instance to be around for the lifetime of the request. You can read more on the service registration options in the Microsoft documentation.

Next, we add a private method to `TextToSpeechService.cs` that will write the MP3 to the blob, and call it from our public method. This private method returns the path to the newly stored item, which we can then pass forward to Twilio.

```csharp
public async Task<HttpSpeechResponse> GetSpeech(string body, string from)
{
    ...
    if (result.IsSuccessStatusCode)
    {
        var stream = result.Content.ReadAsStreamAsync();
        using (MemoryStream bytearray = new MemoryStream())
        {
            stream.Result.CopyTo(bytearray);
            response.Path = await StoreSoundbite(bytearray.ToArray())
                .ConfigureAwait(false);
        }
    }
    return response;
}

private async Task<string> StoreSoundbite(byte[] soundBite)
{
    var blobPath = "PATH_TO_YOUR_BLOB_STORAGE";
    var name = Path.GetRandomFileName();
    var filename = Path.ChangeExtension(name, ".mp3");
    var urlString = blobPath + filename;

    var creds = new StorageCredentials(_storageCreds.Account, _storageCreds.Key);
    var blob = new CloudBlockBlob(new Uri(urlString), creds);
    blob.Properties.ContentType = "audio/mpeg";

    if (!(await blob.ExistsAsync().ConfigureAwait(false)))
    {
        await blob
            .UploadFromByteArrayAsync(soundBite, 0, soundBite.Length)
            .ConfigureAwait(false);
    }
    return urlString;
}
```
Change the `PATH_TO_YOUR_BLOB_STORAGE` URI to match your Azure Storage URL and container name. You can find your container listed under containers within the storage resource. The path takes the following form:

```
// e.g. https://<STORAGE_NAME>.blob.core.windows.net/<CONTAINER_NAME>/
```
Now we edit the `SpeechController.cs` to accept a POST from Twilio that will kick off our conversion of text to speech. I have hard-coded the `SiteUrl`, but you can do it programmatically using .NET Core's `IHttpContextAccessor`. The incoming form is bound to a `TwilioResponse`, as that is all we need to pass on to the next stage.

```csharp
...
[HttpPost]
[Route("voice")]
public async Task<IActionResult> Voice([FromForm] TwilioResponse twilioResponse)
{
    await CallResource.CreateAsync(
        to: new PhoneNumber("YOUR_TELEPHONE_NUMBER"),
        from: "TWILIO_NUMBER",
        url: new Uri($"{SiteUrl}/api/speech/call/{twilioResponse.MessageSid}"),
        method: "GET");

    return Content("");
}
...
```
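If you would rather derive the site URL than hard-code it, the sketch below shows one way with `IHttpContextAccessor`. It assumes `services.AddHttpContextAccessor()` has been added in `Startup.cs`, and `UrlExample` is a hypothetical class name for illustration:

```csharp
// Hedged sketch: building the site URL from the current request
// instead of hard-coding it.
using Microsoft.AspNetCore.Http;

public class UrlExample
{
    private readonly IHttpContextAccessor _contextAccessor;

    // Requires services.AddHttpContextAccessor() in Startup.cs.
    public UrlExample(IHttpContextAccessor contextAccessor)
    {
        _contextAccessor = contextAccessor;
    }

    public string SiteUrl
    {
        get
        {
            var request = _contextAccessor.HttpContext.Request;
            return $"{request.Scheme}://{request.Host}";
        }
    }
}
```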
```csharp
...
[HttpGet]
[Route("call/{messageSid}")]
public async Task<TwiMLResult> Call([FromRoute] string messageSid)
{
    var message = await MessageResource.FetchAsync(pathSid: messageSid);
    var response = await _textToSpeechService
        .GetSpeech(message.Body, message.From.ToString());

    var twiml = new VoiceResponse();
    twiml.Play(new Uri(response.Path));

    return TwiML(twiml);
}
...
```
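For reference, when Twilio fetches this endpoint the `VoiceResponse` above renders TwiML along these lines (the MP3 URL shown is a placeholder for whatever `response.Path` contains):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Play>https://STORAGE_ACCOUNT_NAME.blob.core.windows.net/CONTAINER_NAME/soundbite.mp3</Play>
</Response>
```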
When a text message comes in, Twilio will POST to the `Voice` action in the `SpeechController`. The call created there points back to the `Call` action on the `SpeechController`, which picks up the message Sid off the route. The `Call` action then uses the `TextToSpeechService`, which in turn returns the URI of the stored soundbite.

To test locally, start ngrok, replacing `<PORT_NUMBER>` with the port your localhost is running on:

```
> ngrok http <PORT_NUMBER> -host-header="localhost:<PORT_NUMBER>"
```
Copy the HTTPS forwarding URL from ngrok and replace `SITE_URL` in the `SpeechController` with it. Then start the project with `dotnet run` in the CLI.

We can check everything is running using the `ValuesController` we left in from the template. Paste `https://<NGROK_SUBDOMAIN>.ngrok.io/api/values` into your browser and you should see `["value1","value2"]` displayed once again.

Finally, copy `https://<NGROK_SUBDOMAIN>.ngrok.io/api/speech/voice` from ngrok and paste it into the A MESSAGE COMES IN section.