S-JIS[2011-04-11/2011-04-12] 変更履歴

Scala 減算実験

Scalaでコレクション内の値の差を算出する方法を考えてみる。

計算方法 [2011-04-12]
実行時間 [2011-04-12]

合算実験

差を計算する方法

リストの中に並んでいる隣り合った数値同士の差を計算する方法を考えてみる。
計算対象データは以下の様なものとする。

scala> val list = List.tabulate(10+1)(i => i*i)
list: List[Int] = List(0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100)

forループによる計算

まずはとっても手続き型っぽい方法。

for (i <- 1 until list.size) {
  println(list(i) - list(i-1))
}

せめてyieldを使ってリストにしてみよう。（結果はVectorになったが＾＾；）

scala> for (i <- 1 until list.size) yield { list(i) - list(i-1) }
res1: scala.collection.immutable.IndexedSeq[Int] = Vector(1, 3, 5, 7, 9, 11, 13, 15, 17, 19)

しかしどう考えても実行効率が悪い。
配列なら添字を指定するアクセスは高速だが、Listでは最悪。Listのイテレート（要素の繰り返し処理）で何とかしたいところ。

mapメソッドによる計算

mapメソッドを使って、List内の全要素を順番に計算していく方法。

def f(list: List[Int]) = {
  var m = list.head
  list.tail.map{ n=> val sub = n - m; m = n; sub }
}

scala> f(list)
res2: List[Int] = List(1, 3, 5, 7, 9, 11, 13, 15, 17, 19)

出来たけど、ひどいソースだね(苦笑)

whileループを使った計算

whileループにしてみるか…。

def f(list: List[Int]) = {
  var r: List[Int] = Nil

  var m = list.head
  val i = list.tail.iterator
  while(i.hasNext) {
    val n = i.next
    r = n - m :: r
//  r = (n - m) +: r
    m = n
  }

  r.reverse
}

scala> f(list)
res3: List[Int] = List(1, 3, 5, 7, 9, 11, 13, 15, 17, 19)

うわ、もっと酷い(爆)

それにしても個人的にreverseはなんとなく好きじゃないので、ListBufferにしてみる。

def f(list: List[Int]) = {
  val r = scala.collection.mutable.ListBuffer.empty[Int]

  var m = list.head
  val i = list.tail.iterator
  while(i.hasNext) {
    val n = i.next
    r += n - m
    m = n
  }

  r.toList
}

scala> f(list)
res4: List[Int] = List(1, 3, 5, 7, 9, 11, 13, 15, 17, 19)

うーむ、不変オブジェクトをvarで更新して（差し替えて）いくのと、可変オブジェクトをvalで保持するのは、どちらが好まれるんだろう？＾＾；

値のペアを作って計算

発想を変えて、値のペアのリストを作ることを考えてみる。ペアにすれば、その差を計算するのは簡単だから。
隣り合った要素を両方同時に処理できるメソッドと言ったら、foldLeftとかreduceLeftかな？

scaka> list.tail.foldLeft(list.head){ (m,n) => m->n }
<console>:7: error: type mismatch;
 found   : (Int, Int)
 required: Int
       list.tail.foldLeft(list.head){ (m,n) => m->n }
                                                ^

scala> list.reduceLeft{ (m,n) => m->n }
<console>:7: error: type mismatch;
 found   : (Int, Int)
 required: Int
       list.reduceLeft{ (m,n) => m->n }
                                  ^

scala> list.tail.scanLeft(list.head){ (m,n) => n - m }
res10: List[Int] = List(0, 1, 3, 6, 10, 15, 21, 28, 36, 45, 55)

うーん、上手くいかない。隣り合った要素同士を処理するメソッドって、他には覚えが無いし。

そういえば、フィボナッチ数列を作る例で、本来のリストとそれをずらしたリストをペアにする方法があったなぁ。それを使えば…

scala> list.zip(list.tail)
res12: List[(Int, Int)] = List((0,1), (1,4), (4,9), (9,16), (16,25), (25,36), (36,49), (49,64), (64,81), (81,100))

scala> list.zip(list.tail).map{ t => t._2 - t._1 }
res13: List[Int] = List(1, 3, 5, 7, 9, 11, 13, 15, 17, 19)

出来た！

…ふむ。Listのzipメソッドはその時点でListを作るだろうから、Iteratorの方が速いかもしれない。

scala> list.iterator.zip(list.tail).map{ t => t._2 - t._1 }
<console>:7: error: type mismatch;
 found   : List[Int]
 required: Iterator[?]
       list.iterator.zip(list.tail).map{ t => t._2 - t._1 }
                              ^

Iteratorのzipメソッドは、引数もIteratorでないといけないらしい。

scala> list.iterator.zip(list.tail.iterator).map{ t => t._2 - t._1 }
res15: Iterator[Int] = non-empty iterator

おっと、最終的に計算してやらないとダメか。

scala> list.iterator.zip(list.tail.iterator).map{ t => t._2 - t._1 }.toList
res16: List[Int] = List(1, 3, 5, 7, 9, 11, 13, 15, 17, 19)

もうひとつ、slidingメソッドというのがあるのを発見。[2011-04-12]

scala> list.sliding(2)
res10: Iterator[List[Int]] = non-empty iterator

scala> list.sliding(2).toList
res11: List[List[Int]] = List(List(0, 1), List(1, 4), List(4, 9), List(9, 16), List(16, 25), List(25, 36), List(36, 49), List(49, 64), List(64, 81), List(81, 100))

おお、ちゃんとペアになってるじゃん！

scala> list.sliding(2).map{ l => l.tail.head - l.head }.toList
res14: List[Int] = List(1, 3, 5, 7, 9, 11, 13, 15, 17, 19)

↑リストの先頭はheadで取れるが、2個目の要素はtail.headとなる。

ちなみに、Lispだとリストの先頭要素（head）はcar「カー」、残りリスト（tail）はcdr「クダー」で取得する。(car list)と(cdr list)といった感じ。
なので“次の要素”を取るためには(car (cdr list))になる。Lispではこういった操作が多いので、これをまとめて行うcadrという関数があったりする。(cadr list)。
その系統でcaar・cdar・cadr・cddrとか、caaarとか、いっぱいある。
で、さすがのScalaでもcadr相当のメソッドは無いようだ(笑)

そういえば関数リテラルではパターンマッチが使えるので、そっちを使った方が分かりやすいかな。

scala> list.sliding(2).map{ case List(m, n) => n - m }.toList
res15: List[Int] = List(1, 3, 5, 7, 9, 11, 13, 15, 17, 19)

再帰関数による計算

さて、例によって再帰関数で出来ないか考えてみよう。

def f(m: Int, list: List[Int]) {
  if (list.nonEmpty) {
    println(list.head - m)
    f(list.head, list.tail)
  }
}

f(list.head, list.tail)

うむ、良さそう。
ではprintlnを外してListを返すようにしてみる。

def f(m: Int, list: List[Int]): List[Int] = {
  if (list.nonEmpty) {
    list.head - m :: f(list.head, list.tail)
  } else {
    Nil
  }
}

scala> f(list.head, list.tail)
res21: List[Int] = List(1, 3, 5, 7, 9, 11, 13, 15, 17, 19)

あとはこいつを末尾再帰に変更する。

def f(list: List[Int]) = {
  @scala.annotation.tailrec
  def g(m: Int, list: List[Int], r: List[Int]): List[Int] = {
    if (list.nonEmpty) {
      g(list.head, list.tail, list.head - m :: r)
    } else {
      r
    }
  }

  g(list.head, list.tail, Nil).reverse
}

scala> f(list)
res27: List[Int] = List(1, 3, 5, 7, 9, 11, 13, 15, 17, 19)

ついでに、whileのときと同様に、reverseを回避する為にListBufferにしてみる。

def f(list: List[Int]) = {
  import scala.collection.mutable.ListBuffer

  @scala.annotation.tailrec
  def g(m: Int, list: List[Int], r: ListBuffer[Int]): List[Int] = {
    if (list.nonEmpty) {
      g(list.head, list.tail, r += list.head - m)
    } else {
      r.toList
    }
  }

  g(list.head, list.tail, ListBuffer.empty)
}

scala> f(list)
res32: List[Int] = List(1, 3, 5, 7, 9, 11, 13, 15, 17, 19)

なんかこうなると、ListBufferを引数で受け渡す意味が無いような気もする(苦笑)
外出しにしてしまえ。

def f(list: List[Int]) = {
  val r = scala.collection.mutable.ListBuffer.empty[Int]

  @scala.annotation.tailrec
  def g(m: Int, list: List[Int]) {
    if (list.nonEmpty) {
      r += list.head - m
      g(list.head, list.tail)
    }
  }

  g(list.head, list.tail)
  r.toList
}

scala> f(list)
res33: List[Int] = List(1, 3, 5, 7, 9, 11, 13, 15, 17, 19)

実行速度の比較

では、簡易ベンチマークで速度を比較してみよう。（subtract.scala）

val list = List.tabulate(10000+1)(i => i*i)
benchmark(10, 100){ f(list) }.report

方法	実行時間[ms]		備考
方法	測定値	平均	備考
map	359, 266, 281, 250, 250, 250, 250, 250, 250, 265	258
while・reverse	219, 203, 203, 219, 218, 203, 219, 219, 219, 219	215	この差は誤差の範囲。
while・ListBuffer	234, 219, 203, 203, 203, 218, 218, 203, 203, 219	211	この差は誤差の範囲。
zip	687, 671, 688, 672, 672, 687, 672, 671, 672, 672	676	読みどおり、Iteratorを使う方が若干速い。
zip・Iterator	578, 563, 563, 578, 547, 562, 562, 547, 562, 563	563	読みどおり、Iteratorを使う方が若干速い。
sliding	4516, 4437, 4422, 4407, 4406, 4406, 4422, 4390, 4407, 4422	4416	[2011-04-12]
再帰関数・reverse	234, 219, 218, 234, 235, 219, 234, 219, 219, 219	225
再帰関数・ListBuffer引数渡し	203, 188, 172, 187, 172, 188, 172, 172, 187, 187	182
再帰関数・ListBuffer外出し	281, 172, 171, 188, 172, 172, 172, 172, 172, 172	174

一番すっきり書けたzip方式は、（iteratorを使って若干速くなっても）一番遅いorz
後から思い付いたsliding方式もすっきり書けてるんだけど、桁違いに遅かった(苦笑) [2011-04-12]

この中では再帰関数を使う方式が一番速い。つくづく、Listは再帰関数と相性が良いな(笑)

次点はwhileループ方式か。
List以外のコレクションでIteratorが速い奴がいれば、whileループを使うのが速いのではないかと思う。
つまり候補はArrayやVectorだけど、こいつら（特に配列（Array））は添字でアクセスできるので、for方式も速いかもしれない。

Scala実験へ戻る / Scala目次へ戻る / 技術メモへ戻る

メールの送信先：ひしだま