Introduction
Amazon Simple Store Service (S3) is a service from Amazon that allows you to store files into reliable remote storage for a very competitive price; it is becoming very popular. S3 is used by companies to store photos and videos of their customers, back up their own data, and more. S3 provides both SOAP and REST APIs; this article focuses on using the S3 REST API with the Java programming language.
S3 Basics
S3 handles objects and buckets. An object matches to a stored file. Each object has an identifier, an owner, and permissions. Objects are stored in a bucket. A bucket has a unique name that must be compliant with internet domain naming rules. Once you have an AWS (Amazon Web Services) account, you can create up to 100 buckets associated with that account. An object is addressed by a URL, such as http://s3.amazonaws.com/bucketname/objectid. The object identifier is a filename or filename with relative path (e.g., myalbum/august/photo21.jpg). With this naming scheme, S3 storage can appear as a regular file system with folders and subfolders. Notice that the bucket name can also be the hostname in the URL, so your object could also be addressed by http://bucketname.s3.amazonaws.com/objectid.
S3 REST Security
S3 REST resources are secure. This is important not just for your own purposes, but also because customers are billed depending on how their S3 buckets and objects are used. An AWSSecretKey is assigned to each AWS customer, and this key is identified by an AWSAccessKeyID. The key must be kept secret and will be used to digitally sign REST requests. S3 security features are:
Authentication: Requests include AWSAccessKeyID
Authorization: Access Control List (ACL) could be applied to each resource
Integrity: Requests are digitally signed with AWSSecretKey
Confidentiality: S3 is available through both HTTP and HTTPS
Non repudiation: Requests are time stamped (with integrity, it's a proof of transaction)
The signing algorithm is HMAC/SHA1 (Hashing for Message Authentication with SHA1). Implementing a String signature in Java is done as follows:
private javax.crypto.spec.SecretKeySpec signingKey = null;
private javax.crypto.Mac mac = null;
...
// This method converts AWSSecretKey into crypto instance.
public void setKey(String AWSSecretKey) throws Exception
{
mac = Mac.getInstance("HmacSHA1");
byte[] keyBytes = AWSSecretKey.getBytes("UTF8");
signingKey = new SecretKeySpec(keyBytes, "HmacSHA1");
mac.init(signingKey);
}
// This method creates S3 signature for a given String.
public String sign(String data) throws Exception
{
// Signed String must be BASE64 encoded.
byte[] signBytes = mac.doFinal(data.getBytes("UTF8"));
String signature = encodeBase64(signBytes);
return signature;
}
...
Authentication and signature have to be passed into the Authorization HTTP header like this:
Authorization: AWS <AWSAccessKeyID>: <Signature>.
The signature must include the following information:
HTTP method name (PUT, GET, DELETE, etc.)
Content-MD5, if any
Content-Type, if any (e.g., text/plain)
Metadata headers, if any (e.g., "x-amz-acl" for ACL)
GMT timestamp of the request formatted as EEE, dd MMM yyyy HH:mm:ss
URI path such as /mybucket/myobjectid
Here is a sample of successful S3 REST request/response to create "onjava" bucket:
Request:
PUT /onjava HTTP/1.1
Content-Length: 0
User-Agent: jClientUpload
Host: s3.amazonaws.com
Date: Sun, 05 Aug 2007 15:33:59 GMT
Authorization: AWS 15B4D3461F177624206A:YFhSWKDg3qDnGbV7JCnkfdz/IHY=
Response:
HTTP/1.1 200 OK
x-amz-id-2: tILPE8NBqoQ2Xn9BaddGf/YlLCSiwrKP+OQOpbi5zazMQ3pC56KQgGk
x-amz-request-id: 676918167DFF7F8C
Date: Sun, 05 Aug 2007 15:30:28 GMT
Location: /onjava
Content-Length: 0
Server: AmazonS3
Notice the delay between request and response timestamp? The request Date has been issued after the response Date. This is because the response date is coming from the Amazon S3 server. If the difference from request to response timestamp is too high then a RequestTimeTooSkewed error is returned. This point is another important feature of S3 security; it isn't possible to roll your clock too far forward or back and make things appear to happen when they didn't.
Note: Thanks to ACL, an AWS user can grant read access to objects for anyone (anonymous). Then signing is not required and objects can be addressed (especially for download) with a browser. It means that S3 can also be used as hosting service to serve HTML pages, images, videos, applets; S3 even allows granting time-limited access to objects.
Creating a Bucket
The code below details the Java implementation of "onjava" S3 bucket creation. It relies on packages java.net for HTTP, java.text for date formatting and java.util for time stamping. All these packages are included in J2SE; no external library is needed to talk to the S3 REST interface. First, it generates the String to sign, then it instantiates the HTTP REST connection with the required headers. Finally, it issues the request to s3.amazonaws.com web server.
public void createBucket() throws Exception
{
// S3 timestamp pattern.
String fmt = "EEE, dd MMM yyyy HH:mm:ss ";
SimpleDateFormat df = new SimpleDateFormat(fmt, Locale.US);
df.setTimeZone(TimeZone.getTimeZone("GMT"));
// Data needed for signature
String method = "PUT";
String contentMD5 = "";
String contentType = "";
String date = df.format(new Date()) + "GMT";
String bucket = "/onjava";
// Generate signature
StringBuffer buf = new StringBuffer();
buf.append(method).append("\n");
buf.append(contentMD5).append("\n");
buf.append(contentType).append("\n");
buf.append(date).append("\n");
buf.append(bucket);
String signature = sign(buf.toString());
// Connection to s3.amazonaws.com
HttpURLConnection httpConn = null;
URL url = new URL("http","s3.amazonaws.com",80,bucket);
httpConn = (HttpURLConnection) url.openConnection();
httpConn.setDoInput(true);
httpConn.setDoOutput(true);
httpConn.setUseCaches(false);
httpConn.setDefaultUseCaches(false);
httpConn.setAllowUserInteraction(true);
httpConn.setRequestMethod(method);
httpConn.setRequestProperty("Date", date);
httpConn.setRequestProperty("Content-Length", "0");
String AWSAuth = "AWS " + keyId + ":" + signature;
httpConn.setRequestProperty("Authorization", AWSAuth);
// Send the HTTP PUT request.
int statusCode = httpConn.getResponseCode();
if ((statusCode/100) != 2)
{
// Deal with S3 error stream.
InputStream in = httpConn.getErrorStream();
String errorStr = getS3ErrorCode(in);
...
}
}
Dealing with REST Errors
Basically, all HTTP 2xx response status codes are success and others 3xx, 4xx, 5xx report some kind of error. Details of error message are available in the HTTP response body as an XML document. REST error responses are defined in S3 developer guide. For instance, an attempt to create a bucket that already exists will return:
HTTP/1.1 409 Conflict
x-amz-request-id: 64202856E5A76A9D
x-amz-id-2: cUKZpqUBR/RuwDVq+3vsO9mMNvdvlh+Xt1dEaW5MJZiL
Content-Type: application/xml
Transfer-Encoding: chunked
Date: Sun, 05 Aug 2007 15:57:11 GMT
Server: AmazonS3
<?xml version="1.0" encoding="UTF-8"?>
<Error>
<Code>BucketAlreadyExists</Code>
<Message>The named bucket you tried to create already exists</Message>
<RequestId>64202856E5A76A9D</RequestId>
<BucketName>awsdownloads</BucketName>
<HostId>cUKZpqUBR/RuwDVq+3vsO9mMNvdvlh+Xt1dEaW5MJZiL</HostId>
</Error>
Code is the interesting value in the XML document. Generally, this can be displayed as an error message to the end user. It can be extracted by parsing the XML stream with SAXParserFactory, SAXParser and DefaultHandler classes from org.xml.sax and javax.xml.parsers packages. Basically, you instantiate a SAX parser, then implement the S3ErrorHandler that will filter for Code tag when notified by the SAX parser. Finally, return the S3 error code as String:
public String getS3ErrorCode(InputStream doc) throws Exception
{
String code = null;
SAXParserFactory parserfactory = SAXParserFactory.newInstance();
parserfactory.setNamespaceAware(false);
parserfactory.setValidating(false);
SAXParser xmlparser = parserfactory.newSAXParser();
S3ErrorHandler handler = new S3ErrorHandler();
xmlparser.parse(doc, handler);
code = handler.getErrorCode();
return code;
}
// This inner class implements a SAX handler.
class S3ErrorHandler extends DefaultHandler
{
private StringBuffer code = new StringBuffer();
private boolean append = false;
public void startElement(String uri, String ln, String qn, Attributes atts)
{
if (qn.equalsIgnoreCase("Code")) append = true;
}
public void endElement(String url, String ln, String qn)
{
if (qn.equalsIgnoreCase("Code")) append = false;
}
public void characters(char[] ch, int s, int length)
{
if (append) code.append(new String(ch, s, length));
}
public String getErrorCode()
{
return code.toString();
}
}
A list of all error codes is provided in S3 developer guide. You're now able to create a bucket on Amazon S3 and deal with errors. Full source code is available in resources section.
File Uploading
Upload and download operations require more attention—S3 storage is unlimited, but it allows 5 GB transfer maximum per object. An optional content MD5 check is supported to make sure that transfer has not been corrupted, although an MD5 computation on a 5 GB file will take some time even on fast hardware.
S3 stores the uploaded object only if the transfer is successfully completed. If a network issue occurs then file has to be to uploaded again from the start. S3 doesn't support resuming or object content partial update. That's one of the limits of the first "S" (Simple) in S3, but the simplicity also makes dealing with the API much easier.
When performing a file transfer with S3, you will be responsible for streaming the objects. A good implementation will always stream objects, as otherwise they will grow in Java's heap; with S3's limit of 5 GB on an object, you could quickly be seeing an OutOfMemoryException.
An example of a good upload implementation is available in the resources section of this article.
Beyond This Example
Many other operations are available through the S3 APIs:
List buckets and objects
Delete buckets and objects
Upload and download objects
Add meta-data to objects
Apply permissions
Monitor traffic and get statistics (still a beta API)
Adding custom meta-data to an object is an interesting feature. For example, when uploading a video file, you could add "author," "title," and "location" properties, and retrieve them later when listing the objects. Getting statistics (IP address, referrer, bytes transferred, time to process, etc.) on buckets could be useful too to monitor traffic.
Conclusion
This article introduced the basics of Amazon Simple Store Service REST API. It detailed how to implement bucket creation in Java and how to deal with S3 security principles. It showed that HTTP and XML skills are needed when developing with S3 REST API. Some S3 operations could be improved (especially for upload), but overall Amazon S3 rocks. To go beyond what was presented in this article, you could check Java S3 tools available in the resources section.
References and Resources
Source code: Source code for this article
SOAP: Simple Object Access Protocol
REST: REpresentational State Transfer
S3 APIs: Amazon S3 Developer Guide
HMAC: Keyed-Hashing for Message Authentication (RFC 2104)
S3 forum: S3 forum for developers
S3 upload applet: A Java applet to upload files and folders to S3
Java S3 toolkit: An S3 toolkit for J2SE and J2ME provided by Amazon
Jets3t: Another Java toolkit for S3
分享到:
相关推荐
Amazon Simple Storage Service(Amazon S3)是一种面向 Internet 的存储服务。该服务旨在降低 网络规模计算的难度。 Amazon S3 提供一个简明的 Web 服务界面,用户可通过它随时在 Web 上的任何位置存储和检索任意 ...
AWS SQS 的开发指导
aws.s3:Amazon Simple Storage Service(S3)API客户端
用于Amazon Simple Storage Service(S3)的ACK服务控制器该存储库包含适用于Amazon Simple Storage Service(S3)的AWS Kubernetes控制器(ACK)服务控制器的源代码。 请在主要的AWS Controllers for Kubernetes ...
使用 Amazon Simple__Storage Service (S3) 作为__Enterprise Vault 的主存储__14.0 or later-25.pdf
亚马逊Amazon Simple Notification Service变得更简单
Over 30 hands-on recipes that will get you up and running with Amazon Simple Storage Service (S3) efficiently About This Book Learn how to store, manage, and access your data with AWS SDKs Study the ...
Amazon Simple Storage Service 文档。提供了 Amazon S3 的概念性介绍,以及使用各种功能的详细说明。
S3Lib 是 Amazon Simple Storage Service 的客户端库。 地位 Java 库 S3Lib-Java 已完成并在积极的生产中使用。 使用 S3Lib-Java 来满足其游戏的后端存储需求。 C 库 C 实现 S3Lib-C 正在进行中。 完成图书馆是在...
Amazon Simple Email Service提供了一种简单的方法来发送电子邮件,而无需维护您自己的邮件服务器。 这些PHP类使用该服务的基于REST的接口。 该存储库是Dan Myers开发的0.8.2版的一个分支。在阅读目录安装使用安装...
亚马逊新接口
A fast, powerful PHP toolkit for building web applications with Amazon Web Services. == LINKS == * HOME: http://tarzan-aws.com * DOCS: http://tarzan-aws.com/docs/ * WIKI: http://tarzan-aws.com/wiki/...
S3FS 使用进行存储的 Node.JS 实现。 主要维护者: 目的 S3FS 为 Node.JS 提供的文件系统 (FS) 实现提供了直接替代,允许 Node.JS 应用程序通过 Node.JS 使用的众所周知的使用分布式文件系统。 最低 IAM 政策 以下...
s3文件夹上传 一个小脚本,可通过使用官方的Amazon SDK将静态信息上传到S3存储桶。AWS凭证为了使用此模块,您需要具有AWS Credentials。 您可以通过两种方式加载它们: 通过直接传递给方法作为第二个参数。 通过使用...
s3-simple-blob-store Amazon S3从改编,以支持读取流上的字节范围请求。安装用npm安装$ npm install s3-simple-blob-store例子var aws = require ( 'aws-sdk' ) ;var s3blobs = require ( 's3-blob-store' ) ;var ...
如何使用gSOAP连接到Amazon S3以存储和检索数据
Amazon Simple Storage Service API Reference 2006-3-1
亚马逊规则的S3对象存储接口文档,目前官方只有英文版的,这个少有的中文版,你值得拥有。
bucketstore, 一个与 Amazon S3交互的简单 BucketStore: 一个简单的AmazonS3客户端,用于 python 。BucketStore 是一个非常简单的Amazon S3客户端,在。 它的目的是要比boto3更直接使用,并且专门在 Amazon S3中,...
这个新工具是基于Adobe Flex和AIR平台建立的,并且利用Amazon简单存储服务(Amazon Simple Storage Service,S3)对历史市场数据进行持久化。S3和AIR的组合的部署模型很强大,并且只需要很少的内部基础设施的支持。AIR...