ds analysis with multiple conditions #60

@feanaros

I have a dataset with multiple conditions and I would like to perform a DS analysis, but I am not sure if the data are prepared properly.

> ei
   sample_id condition patient_id n_cells
1   HealthyA   Healthy        IDA   57406
2   HealthyB   Healthy        IDB   57360
3   HealthyE   Healthy        IDE  186564
4      NAFL1      NAFL        ID1  129166
5      NAFL2      NAFL        ID2   84568
6      NAFL3      NAFL        ID4  144629
7      NAFL4      NAFL        ID5  328842
8      NAFL5      NAFL       ID10  209022
9       NAS1       NAS        ID8   84714
10      NAS2       NAS        ID3  216991
11      NAS3       NAS        ID7   85073
12     NASH1      NASH        ID6   95879
13     NASH2      NASH       ID11   67581
14     NASH3      NASH       ID12   47626

> ds_formula1 <- createFormula(ei, cols_fixed = "condition")
> ds_formula1
$formula
y ~ condition
<environment: 0x7fbba1a1a2a8>

$data
   condition
1    Healthy
2    Healthy
3    Healthy
4       NAFL
5       NAFL
6       NAFL
7       NAFL
8       NAFL
9        NAS
10       NAS
11       NAS
12      NASH
13      NASH
14      NASH

$random_terms
[1] FALSE

> contrast <- createContrast(c(0, 1, 0, 0))
> contrast
     [,1]
[1,]    0
[2,]    1
[3,]    0
[4,]    0
> 
> ds_res4 <- diffcyt(
+   sce, 
+   formula = ds_formula1, 
+   contrast = contrast, 
+   analysis_type = "DS", 
+   method_DS = c("diffcyt-DS-LMM"),
+   clustering_to_use = "meta14", 
+   subsampling = 10000, 
+   verbose = TRUE
+ )
using SingleCellExperiment object from CATALYST as input
using cluster IDs from clustering stored in column 'meta14' of 'cluster_codes' data frame in 'metadata' of SingleCellExperiment object from CATALYST
calculating features...
calculating DS tests using method 'diffcyt-DS-LMM'...
There were 50 or more warnings (use warnings() to see the first 50)
> diffcyt::topTable(ds_res4, format_vals = TRUE, top_n = 1000, order_by = "p_adj")
DataFrame with 504 rows and 4 columns
    cluster_id   marker_id     p_val     p_adj
      <factor>    <factor> <numeric> <numeric>
6            6        CD69  0.002060     0.147
6            6       CXCR3  0.003700     0.147
9            9       CXCR3  0.002050     0.147
12          12       CXCR3  0.002580     0.147
3            3       FoxP3  0.000912     0.147
...        ...         ...       ...       ...
11          11 CD223_LAG-3        NA        NA
12          12 CD223_LAG-3        NA        NA
13          13 CD223_LAG-3        NA        NA
14          14 CD223_LAG-3        NA        NA
13          13        CD16        NA        NA

If I try the DS analysis with limma instead:

> ds_design <- createDesignMatrix(ei, cols_design = "condition")
> ds_formula1 <- createFormula(ei, cols_fixed = "condition")
> contrast <- createContrast(c(0, 1, 0, 0))
> ds_res3 <- diffcyt(
+   sce, 
+   design = ds_design, 
+   contrast = contrast, 
+   analysis_type = "DS", 
+   clustering_to_use = "meta14", 
+   subsampling = 10000, 
+   verbose = TRUE,
+   transform = F
+ )
using SingleCellExperiment object from CATALYST as input
using cluster IDs from clustering stored in column 'meta14' of 'cluster_codes' data frame in 'metadata' of SingleCellExperiment object from CATALYST
calculating features...
calculating DS tests using method 'diffcyt-DS-limma'...
Warning messages:
1: In fitFDist(var, df1 = df, covariate = covariate) :
  More than half of residual variances are exactly zero: eBayes unreliable
2: In splines::ns(covariate, df = splinedf, intercept = TRUE) :
  shoving 'interior' knots matching boundary knots to inside

> diffcyt::topTable(ds_res3, format_vals = TRUE, show_logFC = T, top_n = 1000, order_by = "marker_id")
DataFrame with 504 rows and 5 columns
    cluster_id marker_id     logFC     p_val     p_adj
      <factor>  <factor> <numeric> <numeric> <numeric>
1            1      CD45     0.057    0.7820     1.000
2            2      CD45     0.303    0.5810     1.000
3            3      CD45    -0.150    0.4660     1.000
4            4      CD45    -0.140    0.2630     1.000
5            5      CD45    -0.295    0.0535     0.655
...        ...       ...       ...       ...       ...
10          10      CD16     0.000     1.000         1
11          11      CD16     0.000     1.000         1
12          12      CD16    -0.023     0.235         1
13          13      CD16     0.000     1.000         1
14          14      CD16     0.000     1.000         1


Questions:

  1. Is the contrast correct? If I change it to c(-1, 1, 0, 0), the results are significant.
  2. Can you explain the warning messages?
  3. I followed the CATALYST tutorial, which transforms the data using cofactor = 5. Do I need to set transform = FALSE or TRUE?

Thank you.
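
For question 1, one way to see what each contrast vector tests is to look at the columns of the design matrix. The following is a minimal sketch in base R, not diffcyt's own code. It assumes that `createDesignMatrix(ei, cols_design = "condition")` produces the same treatment-coded columns as `model.matrix(~ condition)`, and that `condition` is a factor with `Healthy` as its first level. The `ei` table is rebuilt by hand from the output above, and the coefficient vector `beta` is made up purely for illustration.

```r
# Sample table rebuilt from the post (only the columns needed here).
ei <- data.frame(
  sample_id = c("HealthyA", "HealthyB", "HealthyE",
                "NAFL1", "NAFL2", "NAFL3", "NAFL4", "NAFL5",
                "NAS1", "NAS2", "NAS3",
                "NASH1", "NASH2", "NASH3"),
  condition = factor(
    rep(c("Healthy", "NAFL", "NAS", "NASH"), times = c(3, 5, 3, 3)),
    levels = c("Healthy", "NAFL", "NAS", "NASH")  # assumption: Healthy is the reference
  )
)

# Treatment coding: the intercept is the Healthy mean, and every other
# column is the difference between that condition and Healthy.
design <- model.matrix(~ condition, data = ei)
print(colnames(design))

# Made-up coefficients: Healthy mean = 10, NAFL - Healthy = 2,
# NAS - Healthy = 1, NASH - Healthy = 3.
beta <- c(10, 2, 1, 3)

contrast_a <- c(0, 1, 0, 0)   # selects only the conditionNAFL coefficient
contrast_b <- c(-1, 1, 0, 0)  # conditionNAFL coefficient minus the intercept

# Quantity tested by each contrast under these toy coefficients.
print(sum(contrast_a * beta))  # NAFL mean - Healthy mean
print(sum(contrast_b * beta))  # (NAFL - Healthy) - Healthy mean
```

Under these assumptions, `c(0, 1, 0, 0)` tests the NAFL versus Healthy difference. With an intercept in the model, `c(-1, 1, 0, 0)` also subtracts the Healthy mean itself, so it can look significant whenever the baseline expression is far from zero, which is probably not the comparison of interest. A `-1, 1` style contrast only compares two groups directly when the design has no intercept, for example `model.matrix(~ 0 + condition)`.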
