衡量程式複雜度的方法 - 創意海豚的部落格

字體：小中大

衡量程式複雜度的方法

2009/09/22 21:55:16瀏覽3726｜回應0｜推薦0

衡量程式複雜度的方法

在軟體工程的領域中，有著各式各樣衡量軟體複雜度的方法；其中，有些衡量方法雖然有其理論的論述基礎，但其計算過程則顯得有些繁瑣，例如 Function Points 與 Halstead's Software Science。以筆者的觀點而言，實用的 software metrics 除了必須具備一定程度的說服力之外，還得具備計算簡潔的特點；據此，筆者將為讀者介紹兩種 software metrics：一、cyclomatic complexity；二、information flow complexity。其中，cyclomatic complexity 是以 control flow 的觀點，去衡量程式碼的複雜度；而 information flow complexity 則是以 data flow 的觀點，去衡量程式碼的複雜度。

Cyclomatic Complexity

這個「Cyclomatic Complexity」，是由 Thomas J. McCabe 在1976 年12 月於IEEE Transactions on Software Engineering 所提出的，其主要目的在衡量一個 software module 的 decision logic 之複雜度，並可根據複雜度的數值，來預測這個 software module 的可靠性與穩定度；如果 cyclomatic complexity 的值越大，則表示該 software module 的 decision logic 越複雜，出錯的機率或不穩定的程度也越高。

如果將某個 software module 的程式碼化為 control flow graph 之後，其中有 e 個 edges，n 個 nodes，則其 decision logic 的複雜度之計算公式為：

cyclomatic complexity = e – n + 2

這個由 Thomas J. McCabe 所提出的 cyclomatic complexity，對軟體研發而言有兩個很重要的意義：一、就 software testing 而言，它表示至少需要多少個 test case，才能將 control flow graph 中的每一條 independent path，都走過一遍；二、如果其值過大，例如超過 20，則表示該 software module 的 decision logic 過於複雜，以及目前的軟體架構很可能有其不當之處；因此，應當再次檢視與分析整個 software architecture，以降低這個 software module 的 cyclomatic complexity。

當然，每樣東西都有其好壞兩面；因此，cyclomatic complexity 亦有其未盡完善之處；其中，最為人所詬病的缺點為：一、對於 non-nested control flow 與 nested control flow 皆給予相同的 complexity，但實際上 nested control flow 應當比 non-nested control flow 來得複雜；二、k-way switch statement 之 k 值通常很大，但大部分的 k-way switch statement 的 control flow 是很單純的；例如，只是單純地將 enum 轉成 string literal；因此，k-way switch statement 通常會使得 cyclomatic complexity 失真。針對第一點，筆者在稍後會提出一個 nested cyclomatic complexity 來做修正；至於第二點，則可視情況將k-way switch statement 的 complexity contribution 由計算式中刪除，以避免 cyclomatic complexity 失真。

Information Flow Complexity

所謂「Information Flow Complexity」，顧名思義，就是在分析某個 software module 的 data flow complexity；它所考慮的要素有：一、讀取了哪些 parameters；二、更改了哪些 parameters (updated by value 的不列入計算，只計算updated by reference)；三、呼叫了哪些functions (functions called)；四、被哪些 functions 所呼叫 (functions that call this function)；五、讀取了哪些 global variables；六、更改了哪些 global variables。待查妥所有的要素之值後，即可根據下列公式來計算 information flow complexity：

information flow complexity = length * (fan-in * fan-out)2

where length could be cyclomatic complexity or line of code

fan-in = functions called + parameter read + global variables read

fan-out = functions that call this function + parameter updated by reference + global variables updated

如果某個 software module 的information flow complexity 之值很高，則表示該 software module 有可能是一個負載很重的關鍵點；例如，處理過多種類的資料抑或分派與協調過多種類的功能等等。就這個 information flow complexity 而言，筆者在蒐集與閱讀相關資料的過程中，並沒有看到任何的文獻提到，有什麼樣的典型參考值可以做為負載是否過重的判斷基準；所以，到底 information flow complexity 的值要大到多大，才算是負載過重，到目前為止似乎是沒有定論的；然而，即使沒有典型的參考值，工程師或專案團隊還是可以針對所有 software module 的 information flow complexity，求出它們的「中位數」與「四分位數」，來做為比較與判斷的基準。

( 創作｜其他 )