2013-02-07 3 views
0

Я пытаюсь создать дерево Хаффмана с помощью PHP.Создание дерева Хаффмана в PHP

Это то, что у меня есть:

<!DOCTYPE html> 
<html> 
    <body> 
     <?php 
      $extract = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse elementum imperdiet aliquet. Duis non molestie orci. Ut eget nibh nec augue ultricies porttitor.'; 
      $characters = count_chars($extract, 1);    
      asort($characters); 
      foreach($characters as $character => $occurrence) 
      { 
       echo 'There'; 
       if($occurrence > 1) 
       { 
        echo ' were ' . $occurrence . ' occurrences of '; 
       } 
       else 
       { 
        echo ' was '. $occurrence . ' occurrence of '; 
       } 
       echo '"<strong>' . chr($character) . '</strong>" in the extract.<br />'; 
       $characterFreq[chr($character)] = $occurrence;    
      } 
      print_r($characterFreq); 
     ?> 
    </body> 
</html> 

Какие выходы:

There was 1 occurrence of "S" in the extract. 
There was 1 occurrence of "U" in the extract. 
There was 1 occurrence of "h" in the extract. 
There was 1 occurrence of "L" in the extract. 
There was 1 occurrence of "D" in the extract. 
There was 1 occurrence of "," in the extract. 
There was 1 occurrence of "q" in the extract. 
There was 1 occurrence of "b" in the extract. 
There were 3 occurrences of "g" in the extract. 
There were 4 occurrences of "a" in the extract. 
There were 4 occurrences of "." in the extract. 
There were 4 occurrences of "d" in the extract. 
There were 5 occurrences of "p" in the extract. 
There were 6 occurrences of "c" in the extract. 
There were 6 occurrences of "l" in the extract. 
There were 7 occurrences of "m" in the extract. 
There were 8 occurrences of "n" in the extract. 
There were 8 occurrences of "r" in the extract. 
There were 9 occurrences of "u" in the extract. 
There were 9 occurrences of "o" in the extract. 
There were 10 occurrences of "s" in the extract. 
There were 15 occurrences of "t" in the extract. 
There were 17 occurrences of "i" in the extract. 
There were 20 occurrences of "e" in the extract. 
There were 22 occurrences of " " in the extract. 
Array ([S] => 1 [U] => 1 [h] => 1 [L] => 1 [D] => 1 [,] => 1 [q] => 1 [b] => 1 [g] => 3 [a] => 4 [.] => 4 [d] => 4 [p] => 5 [c] => 6 [l] => 6 [m] => 7 [n] => 8 [r] => 8 [u] => 9 [o] => 9 [s] => 10 [t] => 15 [i] => 17 [e] => 20 [ ] => 22) 

Я использую смеси array_slice(), array_splice() и array_unshift(), но у меня возникают проблемы с рекурсией.

В идеале, листья и ветви будут обозначаться индексами массива 0 & 1.

Любые идеи о том, как сделать дерево Хаффмана в многомерном виде массива будет весьма признателен.

+0

кто готов предложить что-нибудь ..? – verbumSapienti

+1

ну, по крайней мере, он заработал мне значок перелива! – verbumSapienti

ответ

0

Это полное решение проблемы, в PHP:

function huffmannEncode($string) { 
    $originalString = $string; 
    $occurences = array(); 

    while (isset($string[0])) { 
     $occurences[] = array(substr_count($string, $string[0]), $string[0]); 
     $string = str_replace($string[0], '', $string); 
    } 

    sort($occurences); 
    while (count($occurences) > 1) { 
     $row1 = array_shift($occurences); 
     $row2 = array_shift($occurences); 
     $occurences[] = array($row1[0] + $row2[0], array($row1, $row2)); 
     sort($occurences); 
    } 

    // $dictionary is an array that gets filled with the values with a recursive method 
    $dictionary = []; 
    fillDictionary($dictionary, is_array($occurences[0][1]) ? $occurences[0][1] : $occurences); 

    // Generate the final encoded message 
    $encoded = ''; 
    for($i = 0; $i < strlen($originalString); $i++) { 
     $encoded .= $dictionary[$originalString[$i]]; 
    } 
    return $encoded; 
} 

// This function runs recursively to generate the Huffmann tree 
function fillDictionary(&$dictionary, $data, $value = '') { 
    if (!is_array($data[0][1])) { 
     $dictionary[$data[0][1]] = $value.'0'; 
    } else { 
     fillDictionary($dictionary, $data[0][1], $value.'0'); 
    } 
    if (isset($data[1])) { 
     if (!is_array($data[1][1])) { 
      $dictionary[$data[1][1]] = $value.'1'; 
     } else { 
      fillDictionary($dictionary, $data[1][1], $value.'1'); 
     } 
    } 
} 

Эта функция вычисляет вхождения, и с рекурсивной подфункцией он создает словарь, который присваивает двоичный код каждого символ.

Вот пример:

// Test the functionality: 
echo huffmannEncode('hello world'); 
// Output: 00100010101101110011110010101111 
// And the dictionary: 
/* 
[e] => 000 
[h] => 001 
[r] => 010 
[w] => 011 
[l] => 10 
[o] => 110 
[ ] => 1110 
[d] => 1111 
*/