Customizing Alexa Skill speech with alexa-sdk + ssml-builder

This is the Day 13 entry in the one-person Alexa Skills Kit for Node.js Advent Calendar 2017.

I recently had a chance to try ssml-builder (npm), so this post walks through how to use it.

Installation

$ npm init -y
$ npm install alexa-sdk ssml-builder --save

Preparing a Hello World skill

First, create a skill that replies with "Hello World!".

Source of index.js

const Alexa = require('alexa-sdk')
const handlers = {
  LaunchRequest: function () {
    this.emit(':tell', 'Hello World!')
  }
}

module.exports.handler = function (event, context, callback) {
  const alexa = Alexa.handler(event, context)
  alexa.registerHandlers(handlers)
  alexa.execute()
}

Preparing a unit test to observe SSML changes

Deploying the source after every change is tedious, so we verify the output by running tests locally instead.

Adding the test libraries

$ npm install mocha power-assert --save-dev

Source of index.test.js

/* global describe, it */
// Load the assertion library
const assert = require('power-assert')

// Load the file under test
const MyLambdaFunction = require('./index.js')
const { handler } = MyLambdaFunction

// Dummy Lambda event
const event = {
  'session': {
    'new': true,
    'sessionId': 'amzn1.echo-api.session.[unique-value-here]',
    'attributes': {},
    'user': {
      'userId': 'amzn1.ask.account.[unique-value-here]'
    },
    'application': {
      'applicationId': 'amzn1.ask.skill.[unique-value-here]'
    }
  },
  'version': '1.0',
  'request': {
    'locale': 'en-US',
    'timestamp': '2016-10-27T18:21:44Z',
    'type': '',
    'requestId': 'amzn1.echo-api.request.[unique-value-here]'
  },
  'context': {
    'AudioPlayer': {
      'playerActivity': 'IDLE'
    },
    'System': {
      'device': {
        'supportedInterfaces': {
          'AudioPlayer': {}
        }
      },
      'application': {
        'applicationId': 'amzn1.ask.skill.[unique-value-here]'
      },
      'user': {
        'userId': 'amzn1.ask.account.[unique-value-here]'
      }
    }
  }
}

// Handle failures reported through context.fail
const fail = (e) => {
  if (e.name === 'AssertionError') {
    assert.deepEqual(e.expected, e.actual)
  } else {
    assert.deepEqual(e, undefined)
  }
}

// Test code
describe('LaunchRequest', () => {
  it('Say hallo world', () => {
    event.request.type = 'LaunchRequest'
    const succeed = (data) => {
      const { response } = data
      const {
        outputSpeech
      } = response
      assert.deepEqual(
        outputSpeech,
        {
          type: 'SSML',
          ssml: '<speak> Hello World! </speak>'
        }
      )
    }
    // eslint-disable-next-line handle-callback-err
    handler(event, {succeed, fail}, (error, data) => {})
  })
})
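The test above relies on the handler calling context.succeed before the test function returns. As a sketch of a variation, the callbacks can be wrapped in a Promise so that mocha waits for the response. The names invoke and fakeHandler are hypothetical and not part of alexa-sdk; fakeHandler only stands in for the real handler so the snippet runs without any packages installed.

```javascript
// Hypothetical helper (not part of alexa-sdk): wraps the legacy
// context.succeed / context.fail callbacks, and the Lambda callback,
// in a single Promise.
function invoke (handler, event) {
  return new Promise((resolve, reject) => {
    const context = { succeed: resolve, fail: reject }
    handler(event, context, (error, data) => {
      if (error) reject(error)
      else resolve(data)
    })
  })
}

// Stand-in handler so this sketch is self-contained. It answers with
// the same outputSpeech that the Hello World skill in this article returns.
function fakeHandler (event, context) {
  context.succeed({
    version: '1.0',
    response: {
      outputSpeech: { type: 'SSML', ssml: '<speak> Hello World! </speak>' },
      shouldEndSession: true
    }
  })
}
```

In a mocha test you would return the Promise, for example it('Say hallo world', () => invoke(handler, event).then((data) => { /* assert on data.response.outputSpeech */ })), so a failed assertion is reported as a rejected Promise instead of going through the fail handler.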

Running the test

By default the response markup is <speak> Hello World! </speak>, so the test passes.

$ ./node_modules/mocha/bin/_mocha index.test.js 


  LaunchRequest
Warning: Application ID is not set
    ✓ Say hallo world


  1 passing (12ms)

Customizing the SSML

Now it is finally time for SSML Builder.

Speaking `Hello World` as-is

First, I built the same <speak> Hello World! </speak> output with SSML Builder.

// Load the library
const Speech = require('ssml-builder')

const handlers = {
  LaunchRequest: function () {
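    // Initialize the builder
    const speech = new Speech()
    // Add the text to speak
    speech.say('Hello World!')
    // Render to SSML (true omits the <speak> wrapper, which alexa-sdk adds itself)
    const speechOutput = speech.ssml(true)
    // Send the response
    this.emit(':tell', speechOutput)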
  }
}

Inserting a one-second pause between `Hello` and `World`

Calling speech.pause('1s') inserts a pause of the specified duration.

const handlers = {
  LaunchRequest: function () {
    const speech = new Speech()
    speech.say('Hello')
    speech.pause('1s')
    speech.say('World!')
    const speechOutput = speech.ssml(true)
    this.emit(':tell', speechOutput)
  }
}

Running the test now fails, because it still expects the old markup. The failure output shows that the markup has changed to <speak> Hello <break time='1s'/> World! </speak>.

  0 passing (13ms)
  1 failing

  1) LaunchRequest
       Say hallo world:

      AssertionError: { type: 'SSML', ssml: '<speak> Hello World! </speak>' } deepEqual { type: 'SSML',
  ssml: '<speak> Hello <break time=\'1s\'/> World! </speak>' }
      + expected - actual

       {
      -  "ssml": "<speak> Hello World! </speak>"
      +  "ssml": "<speak> Hello <break time='1s'/> World! </speak>"
         "type": "SSML"
       }
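To get the test passing again, the expected value has to follow the new markup. A minimal sketch, assuming the ':tell' response keeps wrapping the speech as '<speak> ... </speak>' as in the outputs shown in this article; expectedSpeech is a hypothetical helper name, not something provided by alexa-sdk or ssml-builder.

```javascript
// Hypothetical test helper: builds the outputSpeech object to compare
// against, using the '<speak> ... </speak>' wrapping seen in the
// test outputs of this article.
function expectedSpeech (inner) {
  return { type: 'SSML', ssml: `<speak> ${inner} </speak>` }
}

// Expected value for the handler that pauses between the two words.
const expected = expectedSpeech("Hello <break time='1s'/> World!")
```

Passing expected as the second argument of assert.deepEqual in the test keeps the wrapper in one place, so only the inner SSML changes from one experiment to the next.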

Playing around with it

As shown here, you can also mark paragraph and sentence boundaries, or have a telephone number read out properly.

const handlers = {
  LaunchRequest: function () {
    const speech = new Speech()
    speech.paragraph('Hello World !')
    speech.pause('1s')
    speech.sentence('This is a sentence')
    speech.sentence('There should be a short pause before this second sentence')
    speech.say('Telephone number is')
    speech.sayAs({
      'word': '+39(011)777-7777',
      'interpret': 'telephone',
      'format': '39'
    })
    const speechOutput = speech.ssml(true)
    this.emit(':tell', speechOutput)
  }
}

Output

<speak> <p>Hello World !</p> <break time='1s'/> <s>This is a sentence</s> <s>There should be a short pause before this second sentence</s> Telephone number is <say-as interpret-as='telephone' format='39'>+39(011)777-7777</say-as> </speak>

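Library-specific helpers

ssml-builder also ships its own helper functions, such as speech.spellSlowly.

It wraps each character of the given word in a spell-out <say-as> tag and inserts a <break/> tag after it.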

const handlers = {
  LaunchRequest: function () {
    const speech = new Speech()
    speech.spellSlowly('Say as slowly', '500ms')
    const speechOutput = speech.ssml(true)
    this.emit(':tell', speechOutput)
  }
}

Output

<speak> <say-as interpret-as='spell-out'>S</say-as> <break time='500ms'/> <say-as interpret-as='spell-out'>a</say-as> <break time='500ms'/> <say-as interpret-as='spell-out'>y</say-as> <break time='500ms'/> <say-as interpret-as='spell-out'> </say-as> <break time='500ms'/> <say-as interpret-as='spell-out'>a</say-as> <break time='500ms'/> <say-as interpret-as='spell-out'>s</say-as> <break time='500ms'/> <say-as interpret-as='spell-out'> </say-as> <break time='500ms'/> <say-as interpret-as='spell-out'>s</say-as> <break time='500ms'/> <say-as interpret-as='spell-out'>l</say-as> <break time='500ms'/> <say-as interpret-as='spell-out'>o</say-as> <break time='500ms'/> <say-as interpret-as='spell-out'>w</say-as> <break time='500ms'/> <say-as interpret-as='spell-out'>l</say-as> <break time='500ms'/> <say-as interpret-as='spell-out'>y</say-as> <break time='500ms'/> </speak>

This looks useful when you want to convey the spelling of something, such as a company name or a person's name.
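To make the transformation concrete, here is a rough sketch that reproduces the pattern visible in the output above. It is a hypothetical re-implementation written for illustration (spellSlowlySketch is a made-up name), not the actual source of ssml-builder, and it does no XML escaping of the input.

```javascript
// Hypothetical re-implementation for illustration only; this is not
// ssml-builder's code. Each character becomes a spell-out <say-as>
// element followed by a <break/> of the given duration.
function spellSlowlySketch (word, delay) {
  return word
    .split('')
    .map((ch) => `<say-as interpret-as='spell-out'>${ch}</say-as> <break time='${delay}'/>`)
    .join(' ')
}

// Two characters are enough to see the repeating pattern.
const spelled = spellSlowlySketch('ab', '500ms')
```

Here spelled is "<say-as interpret-as='spell-out'>a</say-as> <break time='500ms'/> <say-as interpret-as='spell-out'>b</say-as> <break time='500ms'/>", which is the same shape as the library output above without the <speak> wrapper.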

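Closing thoughts

At first I thought, "Couldn't I just write the SSML by hand?", but after actually using ssml-builder I found it quite convenient, partly thanks to its own helper functions.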