当前位置: 首页 > news >正文

街道网站建设wordpress模版seo

街道网站建设,wordpress模版seo,南昌网站建设行情,直接进入qq的网址AWS Dynamodb简介 Amazon DynamoDB 是一种完全托管式、无服务器的 NoSQL 键值数据库#xff0c;旨在运行任何规模的高性能应用程序。DynamoDB能在任何规模下实现不到10毫秒级的一致响应#xff0c;并且它的存储空间无限#xff0c;可在任何规模提供可靠的性能。DynamoDB 提…AWS Dynamodb简介 Amazon DynamoDB 是一种完全托管式、无服务器的 NoSQL 键值数据库旨在运行任何规模的高性能应用程序。DynamoDB能在任何规模下实现不到10毫秒级的一致响应并且它的存储空间无限可在任何规模提供可靠的性能。DynamoDB 提供内置安全性、连续备份、自动多区域复制、内存缓存和数据导出工具。 Redshift简介 Amazon Redshift是一个快速、功能强大、完全托管的PB级别数据仓库服务。用户可以在刚开始使用几百GB的数据然后在后期扩容到PB级别的数据容量。Redshift是一种联机分析处理OLAPOnline Analytics Processing的类型支持复杂的分析操作侧重决策支持并且能提供直观易懂的查询结果。 资源准备 VPC vpc cird block 10.10.0.0/16internet gatewayelastic ip addressnat gateway使用elastic ip address作为public ippublic subnet 三个Availability Zoneprivate subnet 三个Availability Zonepublic route tablepublic subnet关联的route table destination 0.0.0.0/0 target: internet-gateway-id(允许与外界进行通信)destination10.10.0.0/16 local内部通信private route tableprivate subnet关联的route table destination10.10.0.0/16 local内部通信destination 0.0.0.0/0 target: nat-gateway-id(允许内部访问外界)web server security group 允许任意ip对443端口进行访问允许自己的ipdui22端口进行访问以便ssh到服务器上向数据库插入数据glue redshift connection security group 只包含一条self-referencing rule 允许同一个security group对所有tcp端口进行访创建Glue connection时需要使用该security groupReference glue connection security group must have a self-referencing rule to allow to allow AWS Glue components to communicate. Specifically, add or confirm that there is a rule of Type All TCP, Protocol is TCP, Port Range includes all ports, and whose Source is the same security group name as the Group ID.private redshift security group 允许vpc内部10.10.0.0/24对5439端口进行访问允许glue connection security group对5439端口进行访问public redshift security group 允许vpc内部10.10.0.0/24对5439端口进行访问允许kenisis firehose所在region的public ip 对5439端口进行访问 13.58.135.96/27 for US East (Ohio) 52.70.63.192/27 for US East (N. Virginia) 13.57.135.192/27 for US West (N. California) 52.89.255.224/27 for US West (Oregon) 18.253.138.96/27 for AWS GovCloud (US-East) 52.61.204.160/27 for AWS GovCloud (US-West) 35.183.92.128/27 for Canada (Central) 18.162.221.32/27 for Asia Pacific (Hong Kong) 13.232.67.32/27 for Asia Pacific (Mumbai) 13.209.1.64/27 for Asia Pacific (Seoul) 13.228.64.192/27 for Asia Pacific (Singapore) 13.210.67.224/27 for Asia Pacific (Sydney) 13.113.196.224/27 for Asia Pacific (Tokyo) 52.81.151.32/27 for China (Beijing) 161.189.23.64/27 for China (Ningxia) 35.158.127.160/27 for Europe (Frankfurt) 52.19.239.192/27 for Europe (Ireland) 18.130.1.96/27 for Europe (London) 35.180.1.96/27 for Europe (Paris) 13.53.63.224/27 for Europe (Stockholm) 15.185.91.0/27 for Middle East (Bahrain) 18.228.1.128/27 for South America (São Paulo) 15.161.135.128/27 for Europe (Milan) 13.244.121.224/27 for Africa (Cape Town) 13.208.177.192/27 for Asia Pacific (Osaka) 108.136.221.64/27 for Asia Pacific (Jakarta) 3.28.159.32/27 for Middle East (UAE) 18.100.71.96/27 for Europe (Spain) 16.62.183.32/27 for Europe (Zurich) 18.60.192.128/27 for Asia Pacific (Hyderabad) VPC全部资源的serverless文件 custom:bucketNamePrefix 替换为自己的创建的bucket service: dynamodb-to-redshift-vpccustom:bucketNamePrefix: jessicaprovider:name: awsregion: ${opt:region, ap-southeast-1}stackName: ${self:service}deploymentBucket:name: com.${self:custom.bucketNamePrefix}.deploy-bucketserverSideEncryption: AES256resources:Parameters:VpcName:Type: StringDefault: test-vpcResources:VPC:Type: AWS::EC2::VPCProperties:CidrBlock: 10.10.0.0/16EnableDnsSupport: trueEnableDnsHostnames: trueInstanceTenancy: defaultTags:- Key: NameValue: !Sub VPC_${VpcName}# Internet GatewayInternetGateway:Type: AWS::EC2::InternetGatewayProperties:Tags:- Key: NameValue: !Sub VPC_${VpcName}_InternetGatewayVPCGatewayAttachment:Type: AWS::EC2::VPCGatewayAttachmentProperties:VpcId: !Ref VPCInternetGatewayId: !Ref InternetGateway# web server security groupWebServerSecurityGroup:Type: AWS::EC2::SecurityGroupProperties:GroupDescription: Allow access from publicVpcId: !Ref VPCSecurityGroupIngress:- IpProtocol: tcpFromPort: 443ToPort: 443CidrIp: 0.0.0.0/0Tags:- Key: NameValue: !Sub VPC_${VpcName}_WebServerSecurityGroup# public route tableRouteTablePublic:Type: AWS::EC2::RouteTableProperties:VpcId: !Ref VPCTags:- Key: NameValue: !Sub VPC_${VpcName}_RouteTablePublicRouteTablePublicInternetRoute:Type: AWS::EC2::RouteDependsOn: VPCGatewayAttachmentProperties:RouteTableId: !Ref RouteTablePublicDestinationCidrBlock: 0.0.0.0/0GatewayId: !Ref InternetGateway# public subnetSubnetAPublic:Type: AWS::EC2::SubnetProperties:AvailabilityZone: !Select [0, !GetAZs ]CidrBlock: 10.10.0.0/24MapPublicIpOnLaunch: trueVpcId: !Ref VPCTags:- Key: NameValue: !Sub VPC_${VpcName}_SubnetAPublicRouteTableAssociationAPublic:Type: AWS::EC2::SubnetRouteTableAssociationProperties:SubnetId: !Ref SubnetAPublicRouteTableId: !Ref RouteTablePublicSubnetBPublic:Type: AWS::EC2::SubnetProperties:AvailabilityZone: !Select [1, !GetAZs ]CidrBlock: 10.10.32.0/24MapPublicIpOnLaunch: trueVpcId: !Ref VPCTags:- Key: NameValue: !Sub VPC_${VpcName}_SubnetBPublicRouteTableAssociationBPublic:Type: AWS::EC2::SubnetRouteTableAssociationProperties:SubnetId: !Ref SubnetBPublicRouteTableId: !Ref RouteTablePublicSubnetCPublic:Type: AWS::EC2::SubnetProperties:AvailabilityZone: !Select [2, !GetAZs ]CidrBlock: 10.10.64.0/24MapPublicIpOnLaunch: trueVpcId: !Ref VPCTags:- Key: NameValue: !Sub VPC_${VpcName}_SubnetCPublicRouteTableAssociationCPublic:Type: AWS::EC2::SubnetRouteTableAssociationProperties:SubnetId: !Ref SubnetCPublicRouteTableId: !Ref RouteTablePublic# redshift security groupPrivateRedshiftSecurityGroup:Type: AWS::EC2::SecurityGroupProperties:GroupDescription: Allow access from inside vpcVpcId: !Ref VPCSecurityGroupIngress:- IpProtocol: tcpFromPort: 5439ToPort: 5439CidrIp: 10.10.0.0/24- IpProtocol: tcpFromPort: 5439ToPort: 5439SourceSecurityGroupId: !GetAtt GlueRedshiftConnectionSecurityGroup.GroupIdTags:- Key: NameValue: !Sub VPC_${VpcName}_PrivateRedshiftSecurityGroup# redshift security groupPublicRedshiftSecurityGroup:Type: AWS::EC2::SecurityGroupProperties:GroupDescription: Allow access from inside vpc and Kinesis Data Firehose CIDR blockVpcId: !Ref VPCSecurityGroupIngress:- IpProtocol: tcpFromPort: 5439ToPort: 5439CidrIp: 10.10.0.0/24- IpProtocol: tcpFromPort: 5439ToPort: 5439CidrIp: 13.228.64.192/27Tags:- Key: NameValue: !Sub VPC_${VpcName}_PublicRedshiftSecurityGroupGlueRedshiftConnectionSecurityGroup:Type: AWS::EC2::SecurityGroupProperties:GroupDescription: Allow self referring for all tcp portsVpcId: !Ref VPCTags:- Key: NameValue: !Sub VPC_${VpcName}_GlueRedshiftConnectionSecurityGroupGlueRedshiftConnectionSecurityGroupSelfReferringInboundRule:Type: AWS::EC2::SecurityGroupIngressProperties:GroupId: !GetAtt GlueRedshiftConnectionSecurityGroup.GroupIdIpProtocol: tcpFromPort: 0ToPort: 65535SourceSecurityGroupId: !GetAtt GlueRedshiftConnectionSecurityGroup.GroupIdSourceSecurityGroupOwnerId: !Sub ${aws:accountId}# nat gatewayEIP:Type: AWS::EC2::EIPProperties:Domain: vpcNatGateway:Type: AWS::EC2::NatGatewayProperties:AllocationId: !GetAtt EIP.AllocationIdSubnetId: !Ref SubnetAPublic# private route tableRouteTablePrivate:Type: AWS::EC2::RouteTableProperties:VpcId: !Ref VPCTags:- Key: NameValue: !Sub VPC_${VpcName}_RouteTablePrivateRouteTablePrivateRoute:Type: AWS::EC2::RouteProperties:RouteTableId: !Ref RouteTablePrivateDestinationCidrBlock: 0.0.0.0/0NatGatewayId: !Ref NatGateway# private subnetSubnetAPrivate:Type: AWS::EC2::SubnetProperties:AvailabilityZone: !Select [0, !GetAZs ]CidrBlock: 10.10.16.0/24VpcId: !Ref VPCTags:- Key: NameValue: !Sub VPC_${VpcName}_SubnetAPrivateRouteTableAssociationAPrivate:Type: AWS::EC2::SubnetRouteTableAssociationProperties:SubnetId: !Ref SubnetAPrivateRouteTableId: !Ref RouteTablePrivateSubnetBPrivate:Type: AWS::EC2::SubnetProperties:AvailabilityZone: !Select [1, !GetAZs ]CidrBlock: 10.10.48.0/24VpcId: !Ref VPCTags:- Key: NameValue: !Sub VPC_${VpcName}_SubnetBPrivateRouteTableAssociationBPrivate:Type: AWS::EC2::SubnetRouteTableAssociationProperties:SubnetId: !Ref SubnetBPrivateRouteTableId: !Ref RouteTablePrivateSubnetCPrivate:Type: AWS::EC2::SubnetProperties:AvailabilityZone: !Select [2, !GetAZs ]CidrBlock: 10.10.80.0/24VpcId: !Ref VPCTags:- Key: NameValue: !Sub VPC_${VpcName}_SubnetCPrivateRouteTableAssociationCPrivate:Type: AWS::EC2::SubnetRouteTableAssociationProperties:SubnetId: !Ref SubnetCPrivateRouteTableId: !Ref RouteTablePrivateOutputs:VPC:Description: VPC.Value: !Ref VPCExport:Name: !Sub ${self:provider.stackName}SubnetsPublic:Description: Subnets public.Value:!Join [,,[!Ref SubnetAPublic, !Ref SubnetBPublic, !Ref SubnetCPublic],]Export:Name: !Sub ${self:provider.stackName}-PublicSubnetsSubnetsPrivate:Description: Subnets private.Value:!Join [,,[!Ref SubnetAPrivate, !Ref SubnetBPrivate, !Ref SubnetCPrivate],]Export:Name: !Sub ${self:provider.stackName}-PrivateSubnetsDefaultSecurityGroup:Description: VPC Default Security GroupValue: !GetAtt VPC.DefaultSecurityGroupExport:Name: !Sub ${self:provider.stackName}-DefaultSecurityGroupWebServerSecurityGroup:Description: VPC Web Server Security GroupValue: !Ref WebServerSecurityGroupExport:Name: !Sub ${self:provider.stackName}-WebServerSecurityGroupPrivateRedshiftSecurityGroup:Description: The id of the RedshiftSecurityGroupValue: !Ref PrivateRedshiftSecurityGroupExport:Name: !Sub ${self:provider.stackName}-PrivateRedshiftSecurityGroupPublicRedshiftSecurityGroup:Description: The id of the RedshiftSecurityGroupValue: !Ref PublicRedshiftSecurityGroupExport:Name: !Sub ${self:provider.stackName}-PublicRedshiftSecurityGroupGlueRedshiftConnectionSecurityGroup:Description: The id of the self referring security groupValue: !Ref GlueRedshiftConnectionSecurityGroupExport:Name: !Sub ${self:provider.stackName}-GlueSelfRefringSecurityGroupRedshift Cluster Private Cluster subnet group 创建一个包含private subnet的private subnet groupPrivate Cluster用于测试glue job同步数据到redshiftPubliclyAccessible必须设为false否则glue job无法连接 ClusterSubnetGroupName 使用private subnet groupVpcSecurityGroupIds 使用private redshift security groupNodeType: dc2.largeClusterType: single-node PubliclyAccessible: false Public Cluster subnet group 创建一个包含public subnet的public subnet groupPublic Cluster用于测试glue job同步数据到redshiftPubliclyAccessible必须设为true且security group允许kinesis firehose public ip对5439端口进行访问否则firehose无法连接到redshift ClusterSubnetGroupName 使用public subnet groupVpcSecurityGroupIds 使用public redshift security groupNodeType: dc2.largeClusterType: single-nodePubliclyAccessible: true redshift全部资源的serverless文件: custom:bucketNamePrefix 替换为自己的创建的bucket service: dynamodb-to-redshift-redshiftcustom:bucketNamePrefix: jessicaprovider:name: awsregion: ${opt:region, ap-southeast-1}stackName: ${self:service}deploymentBucket:name: com.${self:custom.bucketNamePrefix}.deploy-bucketserverSideEncryption: AES256resources:Parameters:ServiceName:Type: StringDefault: dynamodb-to-redshiftResources:PrivateClusterSubnetGroup:Type: AWS::Redshift::ClusterSubnetGroupProperties:Description: Private Cluster Subnet GroupSubnetIds:Fn::Split:- ,- Fn::ImportValue: !Sub ${ServiceName}-vpc-PrivateSubnetsTags:- Key: NameValue: private-subnetPrivateCluster:Type: AWS::Redshift::ClusterProperties:ClusterIdentifier: test-data-sync-redshiftClusterSubnetGroupName: !Ref ClusterSubnetGroupVpcSecurityGroupIds:- Fn::ImportValue: !Sub ${ServiceName}-vpc-PrivateRedshiftSecurityGroupDBName: devMasterUsername: adminMasterUserPassword: Redshift_admin_2022NodeType: dc2.largeClusterType: single-nodePubliclyAccessible: falsePublicClusterSubnetGroup:Type: AWS::Redshift::ClusterSubnetGroupProperties:Description: Public Cluster Subnet GroupSubnetIds:Fn::Split:- ,- Fn::ImportValue: !Sub ${ServiceName}-vpc-PublicSubnetsTags:- Key: NameValue: public-subnetPublicCluster:Type: AWS::Redshift::ClusterProperties:ClusterIdentifier: test-data-sync-redshift-publicClusterSubnetGroupName: !Ref PublicClusterSubnetGroupVpcSecurityGroupIds:- Fn::ImportValue: !Sub ${ServiceName}-vpc-PublicRedshiftSecurityGroupDBName: devMasterUsername: adminMasterUserPassword: Redshift_admin_2022NodeType: dc2.largeClusterType: single-nodePubliclyAccessible: trueOutputs:PrivateRedshiftEndpoint:Description: Redshift endpointValue: !GetAtt Cluster.Endpoint.AddressExport:Name: !Sub ${self:provider.stackName}-PrivateRedshiftEndpointPrivateRedshiftPort:Description: Redshift portValue: !GetAtt Cluster.Endpoint.PortExport:Name: !Sub ${self:provider.stackName}-PrivateRedshiftPortPublicRedshiftEndpoint:Description: Public Redshift endpointValue: !GetAtt PublicCluster.Endpoint.AddressExport:Name: !Sub ${self:provider.stackName}-PublicRedshiftEndpointPublicRedshiftPort:Description: Public Redshift portValue: !GetAtt PublicCluster.Endpoint.PortExport:Name: !Sub ${self:provider.stackName}-PublicRedshiftPort使用AWS Glue ETL Job进行同步 适用场景 一次性整表同步对于典型的时间序列数据当前的数据写入和读取频率高越老的数据读写频率越低通常会采用为每个时间段每天创建一张表的方式来合理的分配WCU和RCU。如果在当时时间段结束之后需要对该时间段内的所有数据进行复杂的分析操作则需要将dynamodb的整表同步到redshift 架构 优点 使用AWS Glue Crawler可以自动管理源表和目标表的scheme在Glue Job script中可以省去mapping的过程Glue Job script代码易维护 资源部署 Dynamodb table: 源数据表IAM role for glue crawlercrawler需要连接dynamodb和redshift的权限以读取表的schemeDynamodb glue catalog database用于存储crawler生成的dynamodb table schemeRedshift glue catalog database用于存储crawler生成的redshift table schemeDynamodb glue crawler用于读取dynamodb表生成对应的dynamodb table scheme​​​​​​​Redshift glue crawler:用于读取redshift表生成对应的redshift table schemeGlue connection:glue job连接redshift需要用到的connectionIAM role for glue jobGlue job需要S3 bucket for glue jobglue job 如何部署 sls deploy -c glue-etl.yml #replace ${bucketNamePrefix} to your own glue bucket name crate in glue-etl.yml aws s3 cp dynamodb-to-redshift.py s3://com.${bucketNamePrefix}.glue-temp-bucket/script/ 部署文件glue-etl.yml service: dynamodb-to-redshift-glue-etlcustom:bucketNamePrefix: jessicaprovider:name: awsregion: ${opt:region, ap-southeast-1}stackName: ${self:service}deploymentBucket:name: com.${self:custom.bucketNamePrefix}.deploy-bucketserverSideEncryption: AES256resources:Parameters:DynamodbTableName:Type: StringDefault: TestSyncToRedshiftServiceName:Type: StringDefault: dynamodb-to-redshiftGlueBucketName:Type: StringDefault: com.${self:custom.bucketNamePrefix}.glue-etl-temp-bucketResources:TestTable:Type: AWS::DynamoDB::TableProperties:TableName: !Sub ${DynamodbTableName}BillingMode: PAY_PER_REQUESTAttributeDefinitions:- AttributeName: pkAttributeType: S- AttributeName: skAttributeType: SKeySchema:- AttributeName: pkKeyType: HASH- AttributeName: skKeyType: RANGECrawlerRole:Type: AWS::IAM::RoleProperties:RoleName: CrawlerRoleAssumeRolePolicyDocument:Version: 2012-10-17Statement:- Effect: AllowPrincipal:Service:- glue.amazonaws.comAction:- sts:AssumeRoleManagedPolicyArns:- arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole- arn:aws:iam::aws:policy/AmazonDynamoDBFullAccess- arn:aws:iam::aws:policy/AmazonRedshiftFullAccess- arn:aws:iam::aws:policy/AmazonS3FullAccessDynamodbDatabase:Type: AWS::Glue::DatabaseProperties:CatalogId: !Ref AWS::AccountIdDatabaseInput:Name: dynamodb-databaseDynamodbCrawler:Type: AWS::Glue::CrawlerProperties:Name: dynamodb-crawlerConfiguration:Role: !GetAtt CrawlerRole.ArnDatabaseName: !Ref DynamodbDatabaseTargets:DynamoDBTargets:- Path: !Sub ${DynamodbTableName}SchemaChangePolicy:UpdateBehavior: UPDATE_IN_DATABASEDeleteBehavior: LOGSchedule:ScheduleExpression: cron(0/10 * * * ? *) # run every 10 minutesGlueRedshiftConnection:Type: AWS::Glue::ConnectionProperties:CatalogId: !Sub ${aws:accountId}ConnectionInput:Name: ${self:service}-redshift-connectionConnectionType: JDBCMatchCriteria: []PhysicalConnectionRequirements:SecurityGroupIdList:- Fn::ImportValue: !Sub ${ServiceName}-vpc-GlueSelfRefringSecurityGroupSubnetId:Fn::Select:- 1- Fn::Split:- ,- Fn::ImportValue: !Sub ${ServiceName}-vpc-PrivateSubnetsConnectionProperties:JDBC_CONNECTION_URL:Fn::Join:- - - jdbc:redshift://- Fn::ImportValue: !Sub ${ServiceName}-redshift-PrivateRedshiftEndpoint- :- Fn::ImportValue: !Sub ${ServiceName}-redshift-PrivateRedshiftPort- /devJDBC_ENFORCE_SSL: falseUSERNAME: adminPASSWORD: Redshift_admin_2022RedshiftDatabase:Type: AWS::Glue::DatabaseProperties:CatalogId: !Ref AWS::AccountIdDatabaseInput:Name: redshift-databaseRedshiftCrawler:Type: AWS::Glue::CrawlerProperties:Name: redshift-crawlerConfiguration:Role: !GetAtt CrawlerRole.ArnDatabaseName: !Ref RedshiftDatabaseTargets:JdbcTargets:- ConnectionName: !Ref GlueRedshiftConnectionPath: dev/public/test_sync_to_redshiftSchemaChangePolicy:UpdateBehavior: UPDATE_IN_DATABASEDeleteBehavior: LOGRedshiftGlueJobRole:Type: AWS::IAM::RoleProperties:RoleName: RedshiftGlueJobRoleAssumeRolePolicyDocument:Version: 2012-10-17Statement:- Effect: AllowPrincipal:Service:- glue.amazonaws.comAction: sts:AssumeRoleManagedPolicyArns:- arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole- arn:aws:iam::aws:policy/AmazonDynamoDBFullAccess- arn:aws:iam::aws:policy/AmazonRedshiftFullAccess- arn:aws:iam::aws:policy/AmazonS3FullAccess- arn:aws:iam::aws:policy/CloudWatchLogsFullAccessGlueTempBucket:Type: AWS::S3::BucketProperties:BucketName: !Sub ${GlueBucketName}GlueJob:Type: AWS::Glue::JobProperties:Name: dynamodb-to-redshift-glue-etl-jobRole: !GetAtt RedshiftGlueJobRole.ArnCommand:Name: glueetlScriptLocation: !Sub s3://${GlueBucketName}/script/dynamodb-to-redshift.pyPythonVersion: 3DefaultArguments:--TempDir: !Sub s3://${GlueBucketName}/tmp/dynamodb-to-redshift/WorkerType: G.1XNumberOfWorkers: 2GlueVersion: 3.0Connections:Connections:- !Ref GlueRedshiftConnectionglue job脚本dynamodb-to-redshift.py import sys from awsglue.transforms import * from awsglue.utils import getResolvedOptions from pyspark.context import SparkContext from awsglue.context import GlueContext from awsglue.job import Jobparams [JOB_NAME,TempDir, ]args getResolvedOptions(sys.argv, params) sc SparkContext() glueContext GlueContext(sc) spark glueContext.spark_session job Job(glueContext) job.init(args[JOB_NAME], args)DynamoDBtable_node1 glueContext.create_dynamic_frame.from_catalog(databasedynamodb-database,table_nametestsynctoredshift,transformation_ctxDynamoDBtable_node1, )RedshiftCluster_node2 glueContext.write_dynamic_frame.from_catalog(frameDynamoDBtable_node1,databaseredshift-database,table_namedev_public_test_sync_to_redshift,redshift_tmp_dirargs[TempDir],transformation_ctxRedshiftCluster_node2, )job.commit()测试 insert some data to dynamodb table with aws web console first, otherwise, the crawler can not detect the table scheme 2. run dynamodb-crawler, after run success, you can see the database and table in [glue console](https://ap-southeast-1.console.aws.amazon.com/glue/home?regionap-southeast-1#catalog:tabtables) 3. create redshift table with [Redshift query editor v2](https://ap-southeast-1.console.aws.amazon.com/sqlworkbench/home?regionap-southeast-1#/client) CREATE TABLE public.test_sync_to_redshift(pk varchar(200) not null, sk varchar(200) NOT NULL, primary key(pk, sk)); 4. run redshift-crawler, if encounter no valid connection error, please update password in the redshift-connection manually with aws console, dont know why the password is not correct when deploy with cloudformation. After run success, you can see the database and table in [glue console](https://ap-southeast-1.console.aws.amazon.com/glue/home?regionap-southeast-1#catalog:tabtables) 5. run glue etl job, after run success, you can check data in redshift table with [Redshift query editor v2](https://ap-southeast-1.console.aws.amazon.com/sqlworkbench/home?regionap-southeast-1#/client). This glue etl job will insert all data in dynamodb table to redshift table directly, as for redshift, [primary key, and foreign key constraints are informational only; they are not enforced by Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/dg/t_Defining_constraints.html), so if you run the jon serval times, you will see duplicate data with some primary key in redshift table. 使用AWS Glue Streaming Job进行同步 适用场景 持续增量同步表的操作支持插入更新和删除 架构图 资源部署 Dynamodb表VPCRedshift ClusterGlue CrawlerGlueJob 优点 可以支持表的插入更新和删除操作的同步 缺点 使用AWS kinesis Firehose进行同步 适用场景 持续增量同步表的操作只支持插入部分支持更新不支持删除比如记录传感器每秒收集的数据记录网站用户的行为事件 由于kinesis firehose是通过Redshift COPY命令与redshift进行集成的而redshift是不保证primary key的唯一性对redshift来说primary key只是提供信息并没有保证primary key的唯一性如果在COPY命令的源数据中包含多条primary key相同的数据比如对一条数据进行多次修改则会导致redshift表出现多条primary key相同的数据。部分支持更新的意思就是如果多条primary key相同的数据对你的业务逻辑没有影响那也可以使用AWS kinesis Firehose进行同步如果多条primary key对业务逻辑有影响那就不可以使用由于kinesis firehose是通过Redshift COPY命令与redshift进行集成的COPY命令是不支持删除的 架构图 资源部署 Reference Setting up networking for development for AWS Glue - AWS Glue
http://www.hkea.cn/news/14480032/

相关文章:

  • 办网站费用多少钱凤凰自媒体平台注册
  • wordpress做小说网站吗安卓上架app要多少钱
  • 广西南宁建设厅网站首页服务器绑定网站打不开
  • 网站统计代码添加三明seo优化
  • 宝安网站设计网站建设哪家快做网站困难吗
  • 怎样做网站营销php网站源代码修改
  • 网站手机版济宁高端网站建设
  • html5响应式网站建设平台wordpress解密主题
  • 北京大兴行业网站建设公司网站开发过什么软件
  • 怎么建设一个网站手动修改目录wordpress
  • 眼科医院网站开发网站管理入口
  • 朔州网站建设公司养老网站建设合同
  • 网站设计 成都设计建立企业网站最佳的公司
  • 深圳网站建设比较手机上免费制作ppt的软件
  • 重庆网站建设哪家好网站设计可以吗
  • 构建网站需要会什么浙江网缘电子商务有限公司
  • 扬州市住房和建设局网站canva可画在线设计平台
  • 视频网站空间制作网站代码
  • 单位门户网站可以做百度百科泉州网站建设服务
  • 未央网站建设洛阳建站
  • 淘宝购物券网站怎么做公司网站推广的方法
  • 杭州网站建设第一品牌网站设计文案
  • 网站忧化教程主机如何做网站空间
  • 做网站义乌北仑静态网站建设
  • 合肥公司门户网站制作wordpress 论坛 小程序
  • html 旅游网站庐山市星子网
  • 扁平化颜色网站查关键词排名网
  • asp程序网站后台发布产品的时候前台怎么不显示产品名称网站开发公司创业策划
  • flash 如何做游戏下载网站什么叫界面设计
  • 烟台小学网站建设商城建站