This is a bit of grunt work I did recently for my friend Kai. Statistical testing is honestly quite interesting: as with algorithm design, a good test is genuinely hard to design, and it has the flavor of approximation algorithms; a test can rarely achieve both ideal size and ideal power, so trade-offs have to be made. I found the reasoning behind these hypothesis tests interesting enough to make an exception and write it up.
Today is my birthday by the solar calendar (I actually celebrate the lunar one every year). Kai is riding high both at work and in love: no sooner had he landed an offer than he won the girl. Kai is, at heart, an inspiring guy; he went through a lot retaking the graduate entrance exam, and the bitterness has finally paid off. All the more reason to fleece him, hahahaha.

Contents

Hypothesis test ①: $H_0$: $A$ and $B$ are diagonal
1 Generating simulated data $X$
2 Estimating $A$ and $B$ (known at generation time, but treated as unknown)
2.1 Estimation method 1: Naive
2.2 Estimation method 2: Sample
2.3 Estimation method 3: Banded (only for the time-dimension matrix $B$)
3 Hypothesis testing on the estimates of $A$ and $B$
3.1 Test ①: $H_0$: $B$ is a diagonal matrix
3.2 Test ①, analogously: $H_0$: $A$ is a diagonal matrix
3.3 Locating the nonzero entries of the covariance $A$
3.4 Locating the nonzero entries of the covariance $B$
4 Repeating steps 1-3: each of the 8 parameter settings $\{p,q,n,z\}$, 1000 times
4.1 Computing the size of test ① over the 1000 trials
4.2 Computing the power of test ① over the 1000 trials
Appendix: code
Hypothesis test ②: $H_0$: $R_A^1=R_A^2$
5 Generating simulated data $X$
6 Three ways to estimate $\tilde B^{(g)}$
6.1 Estimation method 1: Naive
6.2 Estimation method 2: Sample
6.3 Estimation method 3: Banded
7 Testing the correlation matrices $R^{(g)}_A$ of the two covariance matrices $A$
7.1 Test ②: $H_0$: $R_A^1=R_A^2$, i.e. whether the two groups share the same correlation matrix
7.2 Locating the entries where $R_A^1$ and $R_A^2$ differ
8 Repeating steps 5-7: each of the 8 parameter settings $\{p,q,n,z\}$, 1000 times
Code

Hypothesis test ①: $H_0$: $A$ and $B$ are diagonal
1 Generating simulated data $X$

For the matrix normal distribution $MN_{p,q}(0,A,B)$, $0$ denotes the zero mean, and $A$, $B$ are the row and column covariance matrices respectively. Draw two groups of simulated data $X^{(1)}=(X_1^{(1)},\dots,X_{n_1}^{(1)})$ and $X^{(2)}=(X_1^{(2)},\dots,X_{n_2}^{(2)})$ from the distribution, where each sample such as $X_1^{(1)}$ has dimension $p\times q$. The two groups share the same $B$ but differ in $A$, and $n_1=n_2=n$.
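As a minimal sketch of this sampling step, here is the Cholesky route (an alternative to scipy's built-in `matrix_normal`; the helper name is mine): if $Z$ has i.i.d. standard normal entries and $A=L_AL_A^\top$, $B=L_BL_B^\top$, then $X=L_AZL_B^\top$ has row covariance $A$ and column covariance $B$.

```python
import numpy as np

# Draw one p-by-q sample from MN(0, A, B) via Cholesky factors:
# vec(X) ~ N(0, B kron A), so X = L_A @ Z @ L_B.T with Z i.i.d. N(0, 1).
def sample_matrix_normal(A, B, rng):
    L_A = np.linalg.cholesky(A)
    L_B = np.linalg.cholesky(B)
    Z = rng.standard_normal((A.shape[0], B.shape[0]))
    return L_A @ Z @ L_B.T

rng = np.random.default_rng(0)
p, q = 10, 20
A = np.eye(p)
B = np.array([[0.4 ** abs(i - j) for j in range(q)] for i in range(q)])
X = sample_matrix_normal(A, B, rng)
print(X.shape)  # (10, 20)
```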
Parameter settings: $p\in\{10,20\}$, $q\in\{10,20\}$, $n\in\{5,8\}$.
Matrix $A_{p\times p}$:
$$\begin{cases} H_0: A=I\\[2pt] H_1: A=\dfrac{I+U+\delta I}{1+\delta} \end{cases}$$

Matrix $B_{q\times q}$: $b_{ij}=0.4^{|i-j|}$, $1\le i,j\le q$.

Here $\delta=|\lambda_{\min}(I+U)|+0.05$, where $\lambda_{\min}(I+U)$ denotes the smallest eigenvalue of $I+U$ (absolute value taken). $U$ is a sparse symmetric matrix with $z\in\{2\}$ nonzero entries; $\frac z2$ of them lie in each of the lower/upper triangles (off the diagonal), drawn from the uniform distribution $U\!\left(2\left(\frac{\log p}{nq}\right)^{1/2},4\left(\frac{\log p}{nq}\right)^{1/2}\right)$, with random signs and random positions.

2 Estimating $A$ and $B$ (known at generation time, but treated as unknown)
Part 3 below must be repeated for each of the estimation methods.

2.1 Estimation method 1: Naive

Plug in the true $A$ and $B$ directly.

2.2 Estimation method 2: Sample

The sample estimate of $A_{p\times p}$ is $\tilde A\propto \frac{1}{nq}\sum_{k=1}^nX_kX_k^\top$, and that of $B_{q\times q}$ is $\tilde B\propto \frac{1}{np}\sum_{k=1}^nX_k^\top X_k$. Note that these estimates are only defined up to a constant factor, so they cannot be plugged in directly.
Treating $A$ as known through $\tilde A$ improves the estimate of $B$:
$$\widehat B=\frac1{np}\sum_{k=1}^nX_k^\top\left(\frac{\tilde A}{c}\right)^{-1}X_k$$
Here $c$ is an unknown constant that cancels out in later computations.

Likewise, treating $B$ as known through $\tilde B$ improves the estimate of $A$:
$$\widehat A=\frac1{nq}\sum_{k=1}^nX_k\left(\frac{\tilde B}{c}\right)^{-1}X_k^\top$$
Again, $c$ is an unknown constant that cancels out in later computations.
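The estimators of 2.2 can be sketched compactly in numpy. In this sketch the unknown constant $c$ is simply dropped, which is harmless because the test statistic below is scale-invariant (the helper name is mine):

```python
import numpy as np

# Naive and improved estimates of the row/column covariances (up to the
# unknown constant c, which cancels in the test statistic).
def estimate(sample, p, q, n):
    tilde_A = sum(X @ X.T for X in sample) / (n * q)      # naive estimate of A
    tilde_B = sum(X.T @ X for X in sample) / (n * p)      # naive estimate of B
    # improved estimates: whiten each sample by the other dimension's estimate
    hat_B = sum(X.T @ np.linalg.inv(tilde_A) @ X for X in sample) / (n * p)
    hat_A = sum(X @ np.linalg.inv(tilde_B) @ X.T for X in sample) / (n * q)
    return tilde_A, tilde_B, hat_A, hat_B

rng = np.random.default_rng(1)
p, q, n = 6, 8, 5
sample = [rng.standard_normal((p, q)) for _ in range(n)]
tilde_A, tilde_B, hat_A, hat_B = estimate(sample, p, q, n)
print(hat_A.shape, hat_B.shape)  # (6, 6) (8, 8)
```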
2.3 Estimation method 3: Banded (only for the time-dimension matrix $B$)

Keep only the entries of $\widehat B$ on the diagonal and the two sub-diagonals on each side:
$$\bar b_{i,j}=\begin{cases}\hat b_{i,j} & \text{if }|i-j|\le 2\\ 0 & \text{otherwise}\end{cases}$$

3 Hypothesis testing on the estimates of $A$ and $B$
3.1 Test ①: $H_0$: $B$ is a diagonal matrix

Let $M_n=\max_{1\le i<j\le q}M_{ij}$, where $M_{ij}$ is the standardized version of $\hat b_{ij}$: $M_{ij}=\frac{\hat b_{ij}^2}{\hat \theta_{ij}/(np)}$ (the constant $c$ cancels here), with
$$\hat \theta_{ij}=\frac1{np}\sum_{k=1}^n\sum_{l=1}^p\left[\left(X_k^\top\left(\frac{\tilde A}{c}\right)^{-1/2}\right)_{i,l}\left(X_k^\top\left(\frac{\tilde A}{c}\right)^{-1/2}\right)_{j,l}-\hat b_{i,j}\right]^2$$

Since $M_n-4\log q+\log\log q$ converges to a Gumbel distribution, define the test statistic $\Phi_\alpha=I(M_n\ge q_\alpha+4\log q-\log\log q)$, where $q_\alpha=-\log(8\pi)-2\log\log(1-\alpha)^{-1}$.

When $\Phi_\alpha=1$, reject the null hypothesis that $B$ is diagonal.
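As a sanity check on the rejection rule, the critical value is cheap to compute directly (the function name is mine):

```python
import math

# Rejection threshold for test ①: reject when
# M_n >= q_alpha + 4*log(q) - log(log(q)),
# with q_alpha = -log(8*pi) - 2*log(log((1 - alpha)^(-1))).
def threshold(q_dim, alpha=0.05):
    q_alpha = -math.log(8 * math.pi) - 2 * math.log(-math.log(1 - alpha))
    return q_alpha + 4 * math.log(q_dim) - math.log(math.log(q_dim))

print(threshold(10))
```

At $\alpha=5\%$, $q_\alpha\approx 2.716$, so for $q=10$ the test rejects once $M_n$ exceeds roughly 11.09; the threshold grows with the dimension.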
3.2 Test ①, analogously: $H_0$: $A$ is a diagonal matrix

This amounts to transposing $X^{(1)}$ and repeating 3.1: swap $p$ and $q$, replace $X_k^\top$ with $X_k$, and replace $\tilde A$ with $\tilde B$.
3.3 Locating the nonzero entries of the covariance $A$

$$\Psi(A)=\{(i,j):a_{i,j}\neq 0,\ 1\le i<j\le p\}\qquad \Psi(\tau=4)=\{(i,j):M_{i,j}\ge\tau p,\ 1\le i<j\le p\}$$
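The recovery rule reduces to thresholding the upper triangle of $M$; a toy sketch with a hypothetical statistic matrix (helper name and values are mine):

```python
import numpy as np

# Support recovery: flag (i, j), i < j, whose standardized statistic M_ij
# exceeds the threshold (tau * p for A, with tau = 4 in the simulations).
def recover_support(M, threshold):
    p = M.shape[0]
    return [(i, j) for i in range(p) for j in range(i + 1, p)
            if M[i, j] >= threshold]

tau, p = 4, 5
M = np.zeros((p, p))
M[1, 3] = 25.0  # one strong off-diagonal signal
M[0, 2] = 3.0   # below the threshold tau * p = 20
print(recover_support(M, tau * p))  # [(1, 3)]
```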
3.4 Locating the nonzero entries of the covariance $B$

$$\Psi(B)=\{(i,j):b_{i,j}\neq 0,\ 1\le i<j\le q\}\qquad \Psi(\tau=4)=\{(i,j):M_{i,j}\ge\tau q,\ 1\le i<j\le q\}$$

4 Repeating steps 1-3: each of the 8 parameter settings $\{p,q,n,z\}$, 1000 times
Parameter settings: $p\in\{10,20\}$, $q\in\{10,20\}$, $n_1=n_2=n\in\{5,8\}$, $z=2$, $\alpha=5\%$; the settings must satisfy $np\ge q$ and $nq\ge p$.

The paper's settings: $p\in\{50,200\}$, $q\in\{50,200\}$, $n_1=n_2=n\in\{10,50\}$, $z=8$, $\alpha=5\%$.
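The $2\times2\times2$ grid can be enumerated directly; this snippet just confirms that all 8 combinations of the scaled-down settings satisfy the two constraints:

```python
from itertools import product

# Enumerate the parameter settings {p, q, n} and keep those
# satisfying np >= q and nq >= p.
p_choices, q_choices, n_choices = [10, 20], [10, 20], [5, 8]
settings = [(p, q, n) for p, q, n in product(p_choices, q_choices, n_choices)
            if n * p >= q and n * q >= p]
print(len(settings))  # 8
```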
4.1 For each parameter setting, compute the size of test ① over the 1000 trials

Size $=P(\text{reject }H_0\mid H_0\text{ is true})=P(\text{type I error})$, i.e. the probability in test ① that the matrix $A_0$ is rejected. A good method keeps the size within 0.05.
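Computing the empirical size from the simulation is just a proportion of rejections; a toy example with hypothetical 0/1 outcomes under $H_0$:

```python
# Empirical size: fraction of the N trials in which Phi_alpha = 1
# although H0 holds. `rejections` below is a hypothetical outcome list.
rejections = [0] * 952 + [1] * 48  # e.g. 48 rejections in N = 1000 trials
size = sum(rejections) / len(rejections)
print(size)  # 0.048
```

The empirical power is computed the same way, with the outcomes collected under $H_1$.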
4.2 For each parameter setting, compute the power of test ① over the 1000 trials

Power $=P(\text{reject }H_0\mid H_0\text{ is false})$, i.e. the probability in test ① that $A_1$ and $B$ are rejected.

Appendix: code
The simulation itself is not hard to implement (MATLAB would arguably be the better tool, but numpy example code is given here). Matrix normal variates can be generated either with the ready-made method in scipy.stats or via Cholesky decomposition. Test results show that on small-scale data the size is acceptable but the power is clearly poor, whereas the original paper's results look beautiful; parameters at the paper's scale, however, are extremely slow to run.
```python
# -*- coding: UTF-8 -*-
# author: caoyang
# email: caoyang@163.sufe.edu.cn

import math
import time
import numpy as np
from scipy.linalg import sqrtm
from scipy import stats

# Randomly generate a matrix-normal-distributed matrix.
# M is a p-by-q matrix, U is a p-by-p matrix, and V is a q-by-q matrix.
def randomize_matrix_normal_variable(M, U, V):
    # Equivalent Cholesky route:
    # X_rand = np.random.randn(*M.shape)
    # P = np.linalg.cholesky(U)
    # Q = np.linalg.cholesky(V)
    # return M + P @ X_rand @ Q.T
    return stats.matrix_normal(M, U, V).rvs()

# Randomly generate matrix U
def randomize_matrix_u(p, q, n, z=2):
    temp = np.sqrt(np.log(p) / n / q)
    low = 2 * temp
    high = 4 * temp
    U = np.zeros((p, p))  # initialize
    total_index = [(i, j) for i in range(p) for j in range(p)]
    np.random.shuffle(total_index)
    upper_index = []
    lower_index = []
    i = 0
    while len(upper_index) < int(z / 2) and len(lower_index) < int(z / 2):
        if total_index[i][0] < total_index[i][1] and len(upper_index) < int(z / 2):
            upper_index.append((total_index[i][0], total_index[i][1]))
            lower_index.append((total_index[i][1], total_index[i][0]))
        elif total_index[i][0] > total_index[i][1] and len(lower_index) < int(z / 2):
            lower_index.append((total_index[i][0], total_index[i][1]))
            upper_index.append((total_index[i][1], total_index[i][0]))
        i += 1
    for upper_indice, lower_indice in zip(upper_index, lower_index):
        sign = 2 * np.random.randint(0, 2) - 1  # random +1 / -1 sign
        value = sign * np.random.uniform(low, high)
        U[upper_indice] = value
        U[lower_indice] = value
    return U

# Randomly generate matrix A
def randomize_matrix_a(hypothesis, p, q, n, z=2):
    if hypothesis == 0:
        return np.eye(p)
    elif hypothesis == 1:
        U = randomize_matrix_u(p, q, n, z)
        delta = abs(np.min(np.linalg.eigvals(np.eye(p) + U))) + .05
        return (np.eye(p) + U + delta * np.eye(p)) / (1 + delta)
    assert False, f"Hypothesis should be 0 or 1 but got {hypothesis}!"

# Randomly generate matrix B
def randomize_matrix_b(q):
    # return np.eye(q)
    return np.array([[0.4 ** abs(i - j) for j in range(q)] for i in range(q)])

# Calculate tilde A and tilde B
def calc_tilde_A_and_B(p, q, n, sample):
    tilde_A = np.zeros((p, p))
    for X in sample:
        tilde_A += X @ X.T
    tilde_A /= (n * q)
    tilde_B = np.zeros((q, q))
    for X in sample:
        tilde_B += X.T @ X
    tilde_B /= (n * p)
    return tilde_A, tilde_B

# Method 1
def estimate_method_1(A, B):
    hat_A = A
    hat_B = B
    return hat_A, hat_B

# Method 2
def estimate_method_2(p, q, n, sample):
    tilde_A, tilde_B = calc_tilde_A_and_B(p, q, n, sample)
    hat_A = np.zeros((p, p))
    for X in sample:
        hat_A += X @ np.linalg.inv(tilde_B) @ X.T
    hat_A /= (n * q)
    hat_B = np.zeros((q, q))
    for X in sample:
        hat_B += X.T @ np.linalg.inv(tilde_A) @ X
    hat_B /= (n * p)
    return hat_A, hat_B

# Method 3
def estimate_method_3(p, q, n, sample):
    hat_A, hat_B = estimate_method_2(p, q, n, sample)
    for i in range(q):
        for j in range(q):
            if abs(i - j) > 2:
                hat_B[i, j] = 0
    return hat_A, hat_B

# Hypothesis 1: B is a diagonal matrix
def test_B(p, q, n, sample, tilde_A, hat_B, alpha=.05, tau=4):
    hat_theta = np.zeros((q, q))
    tilde_A_inv = sqrtm(np.linalg.inv(tilde_A))
    for i in range(q):
        for j in range(q):
            res = 0
            for k in range(n):
                X_k_tilde_A_inv = sample[k].T @ tilde_A_inv
                for l in range(p):
                    res += (X_k_tilde_A_inv[i, l] * X_k_tilde_A_inv[j, l] - hat_B[i, j]) ** 2
            hat_theta[i, j] = res
    hat_theta /= (n * p)
    M = n * p * hat_B * hat_B / hat_theta
    for i in range(q):
        for j in range(i + 1):
            M[i, j] = -999  # mask the diagonal and the lower triangle
    M_n = np.max(M)
    q_alpha = -np.log(8 * math.pi) - 2 * np.log(-np.log(1 - alpha))
    Phi_alpha = 1 * (M_n >= q_alpha + 4 * np.log(q) - np.log(np.log(q)))
    return Phi_alpha, M

# Hypothesis 2: A is a diagonal matrix
def test_A(p, q, n, sample, tilde_B, hat_A, alpha=.05, tau=4):
    sample = list(map(lambda X: X.T, sample))
    return test_B(q, p, n, sample, tilde_B, hat_A, alpha)

def run():
    p_choices = [10, 20]
    q_choices = [10, 20]
    n_choices = [5, 8]
    # p_choices = [50, 200]
    # q_choices = [50, 200]
    # n_choices = [10, 50]
    N = 1000
    alpha = .05
    z = 2
    tau = 4
    time_string = time.strftime("%Y%m%d%H%M%S")
    filename = f"res1-{time_string}.txt"
    with open(filename, "w") as f:
        pass
    for p in p_choices:
        for q in q_choices:
            for n in n_choices:
                print(f"p = {p}, q = {q}, n = {n}")
                count_Phi_alpha_B_0 = 0
                count_Phi_alpha_A_0 = 0
                count_Phi_alpha_B_1 = 0
                count_Phi_alpha_A_1 = 0
                for _ in range(N):
                    A_0 = randomize_matrix_a(0, p, q, n, z)
                    A_1 = randomize_matrix_a(1, p, q, n, z)
                    B = randomize_matrix_b(q)
                    sample_0 = [randomize_matrix_normal_variable(np.zeros((p, q)), A_0, B) for _ in range(n)]
                    sample_1 = [randomize_matrix_normal_variable(np.zeros((p, q)), A_1, B) for _ in range(n)]
                    tilde_A_0, tilde_B_0 = calc_tilde_A_and_B(p, q, n, sample_0)
                    tilde_A_1, tilde_B_1 = calc_tilde_A_and_B(p, q, n, sample_1)
                    hat_A_0, hat_B_0 = estimate_method_2(p, q, n, sample_0)
                    hat_A_1, hat_B_1 = estimate_method_2(p, q, n, sample_1)
                    # hat_A_0, hat_B_0 = estimate_method_1(A_0, B)
                    # hat_A_1, hat_B_1 = estimate_method_1(A_1, B)
                    Phi_alpha_B_0, M_B0 = test_B(p, q, n, sample_0, tilde_A_0, hat_B_0, alpha, tau)
                    Phi_alpha_A_0, M_A0 = test_A(p, q, n, sample_0, tilde_B_0, hat_A_0, alpha, tau)
                    Phi_alpha_B_1, M_B1 = test_B(p, q, n, sample_1, tilde_A_1, hat_B_1, alpha, tau)
                    Phi_alpha_A_1, M_A1 = test_A(p, q, n, sample_1, tilde_B_1, hat_A_1, alpha, tau)
                    count_Phi_alpha_B_0 += Phi_alpha_B_0
                    count_Phi_alpha_A_0 += Phi_alpha_A_0
                    count_Phi_alpha_B_1 += Phi_alpha_B_1
                    count_Phi_alpha_A_1 += Phi_alpha_A_1
                    #####################################
                    # 3.3 & 3.4
                    Psi_B0 = (B != 0) * 1  # 0-1 matrix; can be drawn as a heatmap
                    Psi_tau_B0 = (M_B0 >= tau * q) * 1  # 0-1 matrix; can be drawn as a heatmap
                    where_B0 = np.where(Psi_B0 == 1)
                    print([(x, y) for (x, y) in zip(where_B0[0], where_B0[1])])  # coordinates of the 1 entries
                    where_tau_B0 = np.where(Psi_tau_B0 == 1)
                    print([(x, y) for (x, y) in zip(where_tau_B0[0], where_tau_B0[1])])  # coordinates of the 1 entries
                    # ----------------------------------
                    Psi_A0 = (A_0 != 0) * 1
                    Psi_tau_A0 = (M_A0 >= tau * p) * 1
                    where_A0 = np.where(Psi_A0 == 1)
                    print([(x, y) for (x, y) in zip(where_A0[0], where_A0[1])])
                    where_tau_A0 = np.where(Psi_tau_A0 == 1)
                    print([(x, y) for (x, y) in zip(where_tau_A0[0], where_tau_A0[1])])
                    # ----------------------------------
                    # Analogously under H1:
                    Psi_B1 = (B != 0) * 1
                    Psi_tau_B1 = (M_B1 >= tau * q) * 1
                    # ----------------------------------
                    Psi_A1 = (A_1 != 0) * 1
                    Psi_tau_A1 = (M_A1 >= tau * p) * 1
                    #####################################
                print("Phi_alpha_B_0:", count_Phi_alpha_B_0)
                print("Phi_alpha_A_0:", count_Phi_alpha_A_0)
                print("Phi_alpha_B_1:", count_Phi_alpha_B_1)
                print("Phi_alpha_A_1:", count_Phi_alpha_A_1)
                with open(filename, "a") as f:
                    f.write(f"Phi_alpha_B_0: {count_Phi_alpha_B_0}\n")
                    f.write(f"Phi_alpha_A_0: {count_Phi_alpha_A_0}\n")
                    f.write(f"Phi_alpha_B_1: {count_Phi_alpha_B_1}\n")
                    f.write(f"Phi_alpha_A_1: {count_Phi_alpha_A_1}\n")

run()
```

Hypothesis test ②: $H_0$: $R_A^1=R_A^2$
5 Generating simulated data $X$

Data are generated from the matrix normal distribution $MN_{p,q}(0,A^{(g)},B^{(g)})$. $B^{(g)}$ is the autocorrelation matrix of an $AR(1)$ process with coefficients 0.8 and 0.9; here this is simplified to the correlation matrix of the covariance matrix $B^{(g)}_{q\times q}: b_{ij}=0.4^{|i-j|},\ 1\le i,j\le q$, the same as in **1 Generating simulated data $X$**.
Under $H_0$: $A^{(1)}=A^{(2)}=\Sigma^{(1)}=D^{1/2}{\Sigma^*}^{(1)}D^{1/2}$, where
$${\Sigma^*}^{(1)}=(\sigma_{i,j}^{*(1)})=\begin{cases}1 & \text{if } i=j\\ 0.5 & \text{if } 5(k-1)+1\le i\neq j\le 5k \text{ with } k=1,\dots,p/5\\ 0 & \text{otherwise}\end{cases}$$
That is, the nonzero values are concentrated in blocks along the diagonal. $D$ is a diagonal matrix with $d_{i,i}\sim\mathrm{Unif}(0.5,2.5)$.
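The $H_0$ construction above can be sketched directly, assuming $p$ is a multiple of 5 (the helper name is mine):

```python
import numpy as np

# Sigma* has ones on the diagonal and 0.5 inside each 5x5 diagonal block;
# A = D^{1/2} Sigma* D^{1/2} with d_ii ~ Unif(0.5, 2.5).
def make_sigma_star(p):
    S = np.zeros((p, p))
    for k in range(p // 5):
        S[5 * k:5 * (k + 1), 5 * k:5 * (k + 1)] = 0.5
    np.fill_diagonal(S, 1.0)
    return S

rng = np.random.default_rng(2)
p = 10
S = make_sigma_star(p)
D_sqrt = np.diag(np.sqrt(rng.uniform(0.5, 2.5, p)))
Sigma = D_sqrt @ S @ D_sqrt  # symmetric, as a covariance matrix should be
print(S[0, 1], S[0, 5], S[3, 3])  # 0.5 0.0 1.0
```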
Under $H_1$: $(A^{(1)})^{-1}=(\Sigma^{(1)}+\delta I)/(1+\delta)$ and $(A^{(2)})^{-1}=(\Sigma^{(1)}+U+\delta I)/(1+\delta)$, so the two differ by a sparse matrix $U$, where $\delta=|\min\{\lambda_{\min}(\Sigma^{(1)}),\lambda_{\min}(\Sigma^{(1)}+U)\}|+0.05$. $U$ is a sparse symmetric matrix with $z=2$ nonzero entries; $z/2$ of them lie in each of the lower/upper triangles (off the diagonal), drawn from the uniform distribution $U\!\left(3\left(\frac{\log p}{nq}\right)^{1/2},5\left(\frac{\log p}{nq}\right)^{1/2}\right)$, with random signs and random positions.

6 Three ways to estimate $\tilde B^{(g)}$
Part 7 below must be repeated for each of the estimation methods.

6.1 Estimation method 1: Naive

Plug in the true $B^{(g)}$ directly.

6.2 Estimation method 2: Sample

$$\tilde B^{(g)}=\frac{1}{n_gp}\sum_{k=1}^{n_g}(X_k^{(g)})^\top X_k^{(g)}$$

6.3 Estimation method 3: Banded

Keep only the entries of $\tilde B^{(g)}$ on the diagonal and the two sub-diagonals on each side:
$$\bar b^{(g)}_{i,j}=\begin{cases}\tilde b_{i,j}^{(g)} & \text{if }|i-j|\le 2\\ 0 & \text{otherwise}\end{cases}$$

7 Testing the correlation matrices $R^{(g)}_A$ of the two covariance matrices $A$
7.1 Test ②: $H_0$: $R_A^1=R_A^2$, i.e. whether the first and second groups share the same correlation matrix

The formula is too long to typeset, so the original post included it as a screenshot; the statistic is computed in full in the code below.

7.2 Locating the entries where $R_A^1$ and $R_A^2$ differ

$$\Psi^*(R_A^1,R_A^2)=\{(i,j):\hat r_{i,j}^{(1)}\neq \hat r_{i,j}^{(2)},\ 1\le i<j\le p\}\qquad \Psi^*(\tau=4)=\{(i,j):M_{i,j}^*\ge \tau \log p,\ 1\le i<j\le p\}$$

8 Repeating steps 5-7: each of the 8 parameter settings $\{p,q,n,z\}$, 1000 times

Code
This code shares a great deal with the first test's; in fact it was only on seeing the second test that I understood what these tests are doing: roughly, testing covariances and correlation coefficients on autocorrelated time series.
```python
# -*- coding: UTF-8 -*-
# author: caoyang
# email: caoyang@163.sufe.edu.cn

import math
import time
import numpy as np
from scipy.linalg import sqrtm
from scipy import stats

# Randomly generate a matrix-normal-distributed matrix.
# M is a p-by-q matrix, U is a p-by-p matrix, and V is a q-by-q matrix.
def randomize_matrix_normal_variable(M, U, V):
    return stats.matrix_normal(M, U, V).rvs()

# Randomly generate matrix U
def randomize_matrix_u(p, q, n, z=2):
    temp = np.sqrt(np.log(p) / n / q)
    low = 3 * temp
    high = 5 * temp
    U = np.zeros((p, p))  # initialize
    total_index = [(i, j) for i in range(p) for j in range(p)]
    np.random.shuffle(total_index)
    upper_index = []
    lower_index = []
    i = 0
    while len(upper_index) < int(z / 2) and len(lower_index) < int(z / 2):
        if total_index[i][0] < total_index[i][1] and len(upper_index) < int(z / 2):
            upper_index.append((total_index[i][0], total_index[i][1]))
            lower_index.append((total_index[i][1], total_index[i][0]))
        elif total_index[i][0] > total_index[i][1] and len(lower_index) < int(z / 2):
            lower_index.append((total_index[i][0], total_index[i][1]))
            upper_index.append((total_index[i][1], total_index[i][0]))
        i += 1
    for upper_indice, lower_indice in zip(upper_index, lower_index):
        sign = 2 * np.random.randint(0, 2) - 1  # random +1 / -1 sign
        value = sign * np.random.uniform(low, high)
        U[upper_indice] = value
        U[lower_indice] = value
    return U

# Randomly generate matrices A^(1) and A^(2)
def randomize_matrix_a(hypothesis, p, q, n, z=2):
    Sigma_star = np.zeros((p, p))
    def _check(_i, _j):
        # True when (_i, _j) lies off the diagonal inside a 5-by-5 diagonal block
        if _i == _j:
            return False
        for _k in range(p // 5):
            if 5 * _k <= _i < 5 * (_k + 1) and 5 * _k <= _j < 5 * (_k + 1):
                return True
        return False
    for i in range(p):
        for j in range(p):
            if i == j:
                Sigma_star[i, j] = 1
            elif _check(i, j):
                Sigma_star[i, j] = .5
            else:
                Sigma_star[i, j] = 0
    D = np.diag(np.random.uniform(.5, 2.5, p))
    D_sqrt = np.sqrt(D)
    Sigma = D_sqrt @ Sigma_star @ D_sqrt
    if hypothesis == 0:
        A_1 = Sigma[:, :]
        A_2 = Sigma[:, :]
        return A_1, A_2
    elif hypothesis == 1:
        U = randomize_matrix_u(p, q, n, z)
        delta = abs(min(
            np.min(np.linalg.eigvals(Sigma + U)),
            np.min(np.linalg.eigvals(Sigma)),
        )) + .05
        A_1 = np.linalg.inv((Sigma + delta * np.eye(p)) / (1 + delta))
        A_2 = np.linalg.inv((Sigma + U + delta * np.eye(p)) / (1 + delta))
        # A_1 = (Sigma + delta * np.eye(p)) / (1 + delta)
        # A_2 = (Sigma + U + delta * np.eye(p)) / (1 + delta)
        return A_1, A_2
    assert False, f"Hypothesis should be 0 or 1 but got {hypothesis}!"

# Randomly generate matrix B
def randomize_matrix_b(q):
    # return np.eye(q)
    return np.array([[0.4 ** abs(i - j) for j in range(q)] for i in range(q)])

# Calculate tilde B
def calc_tilde_B(p, q, n, sample):
    tilde_B = np.zeros((q, q))
    for X in sample:
        tilde_B += X.T @ X
    tilde_B /= (n * p)
    return tilde_B

# Calculate hat A
def calc_hat_A(p, q, n, sample, tilde_B):
    hat_A = np.zeros((p, p))
    tilde_B_inv = np.linalg.inv(tilde_B)
    for X in sample:
        hat_A += X @ tilde_B_inv @ X.T
    hat_A /= (n * q)
    return hat_A

# Calculate hat R
def calc_hat_R(p, q, n, hat_A):
    hat_R = np.zeros((p, p))
    for i in range(p):
        for j in range(p):
            hat_R[i, j] = hat_A[i, j] / np.sqrt(hat_A[i, i] * hat_A[j, j])
    return hat_R

# Method 1
def estimate_method_1(B):
    hat_B = B
    return hat_B

# Method 2
def estimate_method_2(p, q, n, sample):
    tilde_B = calc_tilde_B(p, q, n, sample)
    return tilde_B

# Method 3
def estimate_method_3(p, q, n, sample):
    bar_B = calc_tilde_B(p, q, n, sample)
    for i in range(q):
        for j in range(q):
            if abs(i - j) > 2:
                bar_B[i, j] = 0
    return bar_B

# Hypothesis: R_A^1 = R_A^2
def test(p, q, n, A_1, A_2, B, sample_1, sample_2, alpha=0.05, tau=4):
    tilde_B_1 = calc_tilde_B(p, q, n, sample_1)
    tilde_B_2 = calc_tilde_B(p, q, n, sample_2)
    # The above uses estimate_method_2; the other two can be swapped in:
    # tilde_B_1 = estimate_method_1(B)
    # tilde_B_2 = estimate_method_1(B)
    # tilde_B_1 = estimate_method_3(p, q, n, sample_1)
    # tilde_B_2 = estimate_method_3(p, q, n, sample_2)
    hat_A_1 = calc_hat_A(p, q, n, sample_1, tilde_B_1)
    hat_A_2 = calc_hat_A(p, q, n, sample_2, tilde_B_2)
    hat_R_1 = calc_hat_R(p, q, n, hat_A_1)
    hat_R_2 = calc_hat_R(p, q, n, hat_A_2)
    tilde_B_1_inv = sqrtm(np.linalg.inv(tilde_B_1))
    tilde_B_2_inv = sqrtm(np.linalg.inv(tilde_B_2))
    M = np.zeros((p, p))
    for i in range(p):
        for j in range(p):
            theta_1 = 0
            theta_2 = 0
            for k in range(n):
                X_k_tilde_B_1_inv = sample_1[k] @ tilde_B_1_inv
                X_k_tilde_B_2_inv = sample_2[k] @ tilde_B_2_inv
                for l in range(q):
                    theta_1 += (X_k_tilde_B_1_inv[i, l] * X_k_tilde_B_1_inv[j, l] - hat_A_1[i, j]) ** 2
                    theta_2 += (X_k_tilde_B_2_inv[i, l] * X_k_tilde_B_2_inv[j, l] - hat_A_2[i, j]) ** 2
            theta_1 /= (n * q)
            theta_2 /= (n * q)
            pretty_theta_1 = theta_1 / hat_A_1[i, i] / hat_A_1[j, j]
            pretty_theta_2 = theta_2 / hat_A_2[i, i] / hat_A_2[j, j]
            M[i, j] = ((hat_R_1[i, j] - hat_R_2[i, j]) ** 2) / (pretty_theta_1 / n / q + pretty_theta_2 / n / q)
    for i in range(p):
        for j in range(i + 1):
            M[i, j] = -999  # mask the diagonal and the lower triangle
    M_n = np.max(M)
    q_alpha = -np.log(8 * math.pi) - 2 * np.log(-np.log(1 - alpha))
    Phi_alpha = 1 * (M_n >= q_alpha + 4 * np.log(p) - np.log(np.log(p)))
    ##########################
    # 7.2
    Psi_star = 1 * (hat_R_1 != hat_R_2)  # 0-1 matrix; can be drawn as a heatmap
    hat_Psi_star = 1 * (M >= tau * np.log(p))  # 0-1 matrix; can be drawn as a heatmap
    where_Psi_star = np.where(Psi_star == 1)
    # print([(x, y) for (x, y) in zip(where_Psi_star[0], where_Psi_star[1])])  # coordinates of the 1 entries
    where_hat_Psi_star = np.where(hat_Psi_star == 1)
    # print([(x, y) for (x, y) in zip(where_hat_Psi_star[0], where_hat_Psi_star[1])])  # coordinates of the 1 entries
    ##########################
    return Phi_alpha, Psi_star, hat_Psi_star

def run():
    p_choices = [10, 20]
    q_choices = [10, 20]
    n_choices = [5, 8]
    z = 2
    alpha = .05
    N = 1000
    tau = 4
    time_string = time.strftime("%Y%m%d%H%M%S")
    filename = f"res2-{time_string}.txt"
    with open(filename, "w") as f:
        pass
    for p in p_choices:
        for q in q_choices:
            for n in n_choices:
                print(f"p = {p}, q = {q}, n = {n}")
                count_Phi_alpha_0 = 0
                count_Phi_alpha_1 = 0
                for _ in range(N):
                    A_01, A_02 = randomize_matrix_a(0, p, q, n, z)
                    A_11, A_12 = randomize_matrix_a(1, p, q, n, z)
                    B = randomize_matrix_b(q)
                    sample_01 = [randomize_matrix_normal_variable(np.zeros((p, q)), A_01, B) for _ in range(n)]
                    sample_02 = [randomize_matrix_normal_variable(np.zeros((p, q)), A_02, B) for _ in range(n)]
                    sample_11 = [randomize_matrix_normal_variable(np.zeros((p, q)), A_11, B) for _ in range(n)]
                    sample_12 = [randomize_matrix_normal_variable(np.zeros((p, q)), A_12, B) for _ in range(n)]
                    # the Psi matrices of 7.2 are returned here
                    Phi_alpha_0, Psi_star_0, hat_Psi_star_0 = test(p, q, n, A_01, A_02, B, sample_01, sample_02, alpha, tau)
                    Phi_alpha_1, Psi_star_1, hat_Psi_star_1 = test(p, q, n, A_11, A_12, B, sample_11, sample_12, alpha, tau)
                    count_Phi_alpha_0 += Phi_alpha_0
                    count_Phi_alpha_1 += Phi_alpha_1
                print("Phi_alpha_0:", count_Phi_alpha_0)
                print("Phi_alpha_1:", count_Phi_alpha_1)
                with open(filename, "a") as f:
                    f.write(f"Phi_alpha_0: {count_Phi_alpha_0}\n")
                    f.write(f"Phi_alpha_1: {count_Phi_alpha_1}\n")

if __name__ == "__main__":
    run()
```