Analyzing data with a Chi-Square (r x c) Contingency table.

 

This test allows you to analyze categorical data arranged in tabular form, as in the example below. The example data are entirely fabricated! Use this example as a guide to the analysis, NOT IN YOUR PAPER OR PRESENTATION!!!!

 

  1. Place your data in a table similar to this one. You will need to modify the table to fit your question.

 

Valine codon    Housekeeping genes    Luxury genes

GTT             50                    25
GTC             25                    25
GTA             15                    10
GTG             10                    40

 

  2. In this example we are asking whether there is an association between gene function (housekeeping vs. luxury genes) and codon usage. To determine this, we test the null hypothesis

 

H0: Codon usage is the same in housekeeping and luxury genes; that is, there is no association between gene function and codon usage.

 

Against the alternative,

 

HA: Codon usage is not the same in housekeeping and luxury genes; that is, there is an association between gene function and codon usage.

 

  3. Using this information, go to the following website.

 

http://www.physics.csbsju.edu/stats/contingency_NROW_NCOLUMN_form.html

           

In this case, the table has 4 rows and 2 columns. Type your data into the next page, which gives you the correct setup for your contingency chi-square.
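Before typing the counts into the website, it can help to confirm the table's dimensions and totals. The following is a minimal Python sketch, not part of the website; the variable names (`codons`, `observed`, and so on) are chosen here for illustration, and the counts are the fabricated example data from the table in step 1.

```python
# Fabricated example counts: rows are valine codons,
# columns are (housekeeping genes, luxury genes).
codons = ["GTT", "GTC", "GTA", "GTG"]
observed = [
    [50, 25],
    [25, 25],
    [15, 10],
    [10, 40],
]

# Dimensions to give the website: number of rows and columns.
n_rows = len(observed)
n_cols = len(observed[0])

# Row totals, column totals, and the grand total.
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

print(f"{n_rows} rows x {n_cols} columns")
for codon, row, total in zip(codons, observed, row_totals):
    print(codon, row, total)
print("column totals:", col_totals, "grand total:", grand_total)
```

The totals printed here should match the totals the website shows alongside your data, which is a quick check against typing errors.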

 

This is what you should see when you run the above numbers:

r × c Contingency Table: Results

The results of a contingency table χ² statistical test performed at 13:37 on 17-NOV-2004

data: contingency table
 
       A      B
 
1     50     25     75
2     25     25     50
3     15     10     25
4     10     40     50
 
     100    100    200
 
 
 
expected: contingency table
 
        A          B
 
1    37.5       37.5    
2    25.0       25.0    
3    12.5       12.5    
4    25.0       25.0    
 
 

chi-square = 27.3
degrees of freedom = 3
probability = 0.000
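The expected counts and the chi-square statistic in this output can be reproduced by hand from the observed counts. The sketch below assumes the standard formulas for a contingency table: each expected count is (row total × column total) / grand total, the statistic is the sum of (observed − expected)² / expected over all cells, and the degrees of freedom are (rows − 1) × (columns − 1). The variable names are illustrative and not taken from the website.

```python
# Fabricated example counts (housekeeping, luxury) for GTT, GTC, GTA, GTG.
observed = [[50, 25], [25, 25], [15, 10], [10, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell under the null hypothesis:
# (row total * column total) / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over all cells of (observed - expected)^2 / expected
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

print(expected)
print(round(chi_square, 1), df)
```

For example, the expected count for GTT in housekeeping genes is 75 × 100 / 200 = 37.5, which matches the first row of the expected table.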

 

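  1. The first table is what you entered, with row totals, column totals, and the grand total.
  2. The second table lists the expected values, that is, the counts you would expect if the null hypothesis were true.
  3. The final output has the chi-square (χ²) value, the degrees of freedom, and the p value. The p value is rounded to three decimal places, so 0.000 means p < 0.001. If p < .05 you should reject the null hypothesis of no association.
  4. In this case p < .05, so we reject the null hypothesis and conclude that codon usage is different in housekeeping and luxury genes.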
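Because the website rounds the p value to 0.000, it can be useful to see roughly how small it is. The sketch below is one way to compute the upper-tail probability of a chi-square distribution using only the Python standard library. The helper `chi_square_p_value` is hypothetical (written for this example, not taken from the website or any library), and it relies on the standard closed forms for 1 and 2 degrees of freedom plus a recurrence that adds two degrees of freedom at a time.

```python
import math


def chi_square_p_value(statistic, df):
    """Upper-tail probability of a chi-square distribution (illustrative helper).

    Uses Q(x; 1) = erfc(sqrt(x/2)), Q(x; 2) = exp(-x/2), and the recurrence
    Q(x; k+2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1).
    """
    if df < 1 or statistic < 0:
        raise ValueError("df must be >= 1 and statistic must be >= 0")
    half = statistic / 2.0
    if df % 2 == 1:
        k, q = 1, math.erfc(math.sqrt(half))
    else:
        k, q = 2, math.exp(-half)
    # Step up two degrees of freedom at a time until reaching df.
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


# Values reported by the website for the example table.
p = chi_square_p_value(27.3, 3)
alpha = 0.05

print(f"p = {p:.2e}")
print("reject H0" if p < alpha else "fail to reject H0")
```

For the example data the computed p value is far below 0.001, which is consistent with the website's rounded 0.000 and with rejecting the null hypothesis at the .05 level.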