
    For the unfortunate souls not in the know (or is it the fortunate souls using one of the myriad alternatives?), SquishIt is a library used to optimize content delivery at runtime in ASP.NET applications. It combines and minifies JavaScript files, and also does a bit of preprocessing for things like LESS and CoffeeScript. I've been working with Justin Etheredge on this for a while, first on patches for various bugs I encountered trying to use SquishIt on Linux, but more recently my focus has been on improving extensibility. One area where I really felt the library could benefit from increased extensibility is CDN support. I've been doing a lot of work with Amazon CloudFront lately, and decided it would be cool to see how cleanly I could get SquishIt to work with the service.

    SquishIt CDN Support in Previous Versions

    I suppose it makes sense to start with what we already had in place. CDN support in SquishIt has been slowly progressing, largely thanks to community contributions. As of version 0.8.6 we did have support for injecting a base URL into asset paths, but this required you to know the generated file name in advance and upload it to your CDN through other means. While this worked, and could be easily automated in your build process, it wasn't exactly convenient. The pull requests we've gotten have done a fairly good job of showing us what the community wants in terms of CDN support, so it feels like it is time to start treating it as a first-class citizen. The first thing I would like is a way to render the combined file directly to my CDN if it doesn't already exist, and maybe a way to force overwriting the file if need be.

    Adding Support for Custom Renderers

    For a while now, SquishIt has had an IRenderer interface, but it has only been used internally to support rendering to the file system or to an in-memory cache. To get started, we needed to expose this interface publicly so that other assemblies could provide implementations. Once it was exposed, we needed a way for consumers to supply their custom renderers to the SquishIt core.

    SquishIt uses a fluent configuration syntax for setting up individual bundles, and that seemed as good a place to start as any. Typical usage looks something like this:

    Bundle.JavaScript()
        .WithAttribute("attrName", "attrValue")
        .Add("file1.js")
        .Add("/otherscripts/file2.js")
        .Render("combinedOutput.js");

    So the first thing that came to mind was to add something like a "WithFileRenderer" method. This would give us a way to inject a renderer into a bundle and have it used to render the combined files. However, we probably don't want to render to the CDN while debugging, so "WithReleaseFileRenderer" might be more appropriate. Setting up the method went something like this:

    IRenderer releaseFileRenderer;

    public T WithReleaseFileRenderer(IRenderer renderer)
    {
        this.releaseFileRenderer = renderer;
        return (T)this;
    }

    We also want a way to configure this globally. To do that, we needed to add a bit to our configuration class. This was basically the same thing:

    IRenderer _defaultReleaseRenderer;
    public Configuration UseReleaseRenderer(IRenderer releaseRenderer)
    {
        _defaultReleaseRenderer = releaseRenderer;
        return this;
    }

    Finally, we need to change the way the file renderer is obtained when we go to render the combined assets. Previously we were instantiating a new FileRenderer or CacheRenderer depending on circumstance, and passing that renderer into the main rendering method. This won't cut it anymore, as our needs have gotten significantly more complex. The constraints we have to deal with are as follows:

    • When debugging we should use a normal file renderer
    • We should favor a renderer configured at the instance level over one configured statically
    • If no instance or static renderer is configured we should use the old default behavior

    So the constructor calls for a new FileRenderer are replaced with calls to this method:

    protected IRenderer GetFileRenderer()
    {
        return debugStatusReader.IsDebuggingEnabled() ? new FileRenderer(fileWriterFactory) :
            bundleState.ReleaseFileRenderer ??
            Configuration.Instance.DefaultReleaseRenderer() ??
            new FileRenderer(fileWriterFactory);
    }

    This is basically all we needed to do to enable us to plug in a custom renderer to use in release mode. Now we can look at how we can make the CDN integration happen.
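
    The selection rules above can be exercised in isolation. The following is a minimal sketch rather than SquishIt's actual code: RendererSelector, NamedRenderer and the parameter names are hypothetical stand-ins, and only the shape of IRenderer is taken from the library.

```csharp
using System;

// Same shape as SquishIt's IRenderer, redeclared so the sketch compiles on its own.
public interface IRenderer
{
    void Render(string content, string outputPath);
}

// Hypothetical renderer that only carries a name; it stands in for FileRenderer, an S3 renderer, etc.
public sealed class NamedRenderer : IRenderer
{
    public string Name { get; private set; }

    public NamedRenderer(string name) { Name = name; }

    public void Render(string content, string outputPath) { /* no-op for the sketch */ }
}

// Hypothetical helper mirroring the fallback chain: debug first, then bundle, then global, then default.
public static class RendererSelector
{
    public static IRenderer Select(bool debugging,
                                   IRenderer bundleRenderer,
                                   IRenderer globalRenderer,
                                   Func<IRenderer> createDefault)
    {
        // Debugging always wins, so nothing is pushed to the CDN during development.
        if (debugging) return createDefault();

        // Otherwise prefer the bundle-level renderer, then the global one, then the old default.
        return bundleRenderer ?? globalRenderer ?? createDefault();
    }
}
```

    Passing the default as a factory keeps the sketch from constructing a file renderer when a custom one ends up being used.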

    Building the S3 Keys

    The only really tricky thing about building the renderer is that it takes a string representing the disk location to render to. Changing what the renderer takes as a parameter would involve a more significant change to the core behavior than I'm comfortable making right now, so the first thing we need is a way to turn these disk locations into keys that S3 can use. The key we create needs to match the relative path of the locally rendered asset, so that injecting the base URL yields the absolute URL we need.

    None of this is terribly difficult. The main edge cases we need to cover are:

    • Root appearing twice in the file path (because of Windows' drive lettering this is mostly an issue when running on Unix-based systems)
    • Virtual directories

    To meet these requirements the two pieces of information that we need inside the key builder are the physical application path and virtual directory. Here are some tests:

    [Test]
    public void ReturnToRelative()
    {
        var root = @"C:\fake\dir\";
        var file = @"another\file.js";
        var expected = @"another/file.js";

        var builder = new KeyBuilder(root, "");
        Assert.AreEqual(expected, builder.GetKeyFor(root + file));
    }

    [Test]
    public void ReturnToRelative_Injects_Virtual_Directory()
    {
        var root = @"C:\fake\dir\";
        var file = @"another\file.js";
        var vdir = "/this";
        var expected = @"this/another/file.js";

        var builder = new KeyBuilder(root, vdir);
        Assert.AreEqual(expected, builder.GetKeyFor(root + file));
    }

    [Test]
    public void ReturnToRelative_Only_Replaces_First_Occurrence_Of_Root()
    {
        var root = @"test/";
        var file = @"another/andthen/test/again.js";
        var expected = @"another/andthen/test/again.js";

        var builder = new KeyBuilder(root, "");
        Assert.AreEqual(expected, builder.GetKeyFor(root + file));
    }

    And the implementation for the KeyBuilder:

    internal class KeyBuilder : IKeyBuilder
    {
        readonly string physicalApplicationPath;
        readonly string virtualDirectory;

        public KeyBuilder(string physicalApplicationPath, string virtualDirectory)
        {
            this.physicalApplicationPath = physicalApplicationPath;
            this.virtualDirectory = virtualDirectory;
        }

        public string GetKeyFor(string path)
        {
            // S3 keys carry no leading slash.
            return RelativeFromAbsolutePath(path).TrimStart('/');
        }

        string RelativeFromAbsolutePath(string path)
        {
            // Strip the application root only when it is a prefix, so a later
            // occurrence of the same text inside the path is left alone.
            path = path.StartsWith(physicalApplicationPath)
                            ? path.Substring(physicalApplicationPath.Length)
                            : path;

            // Normalize Windows separators and prepend the virtual directory.
            return virtualDirectory + "/" + path.Replace(@"\", "/").TrimStart('/');
        }
    }
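
    To see why the key has to mirror the locally rendered relative path, it helps to line the two URLs up. This is only a sketch under stated assumptions: CdnUrlSketch and both of its methods are hypothetical, and the way the base href is joined here is a guess at the general idea, not SquishIt's own implementation.

```csharp
// Hypothetical helper; neither method is part of SquishIt or the AWS SDK.
public static class CdnUrlSketch
{
    // Assumed joining rule: base href + application-relative path, with exactly one slash between them.
    public static string InjectBaseHref(string baseHref, string relativePath)
    {
        return baseHref.TrimEnd('/') + "/" + relativePath.TrimStart('/');
    }

    // Path-style S3 address for an object: host, then bucket, then key.
    public static string PathStyleS3Url(string bucket, string key)
    {
        return "http://s3.amazonaws.com/" + bucket + "/" + key;
    }
}
```

    With a base href of http://s3.amazonaws.com/bucket and a rendered path of /this/another/file.js, the injected URL only resolves if the object was stored under the key this/another/file.js, which is what the second test above expects from the key builder.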

    Now that we have a means to build keys we can look at implementing the S3 Renderer.

    S3 Renderer Implementation

    The interface for renderers is very simple.

    public interface IRenderer
    {
        void Render(string content, string outputPath);
    }

    The only things we'll need to implement this method are an initialized S3 client, a bucket and the key builder we implemented in the last section. By default, we won't want to upload our content if it already exists on the CDN, so we will need to check for existence before uploading the content. This can be done by querying for object metadata using the desired key. If the file doesn't exist, the exception thrown by the S3 client will carry a "not found" status. So the most important test will look like this:

    [Test]
    public void Render_Uploads_If_File_Doesnt_Exist()
    {
        var s3client = new Mock<AmazonS3>();
        var keyBuilder = new Mock<IKeyBuilder>();

        var key = "key";
        var bucket = "bucket";
        var path = "path";
        var content = "content";

        keyBuilder.Setup(kb => kb.GetKeyFor(path)).Returns(key);

        s3client.Setup(c => c.GetObjectMetadata(It.Is<GetObjectMetadataRequest>(gomr => gomr.BucketName == bucket && gomr.Key == key))).
            Throws(new AmazonS3Exception("", HttpStatusCode.NotFound));

        using(var renderer = S3Renderer.Create(s3client.Object)
                                .WithBucketName(bucket)
                                .WithKeyBuilder(keyBuilder.Object))
        {
            renderer.Render(content, path);
        }

        s3client.Verify(c => c.PutObject(It.Is<PutObjectRequest>(por => por.Key == key &&
                                                                            por.BucketName == bucket &&
                                                                            por.ContentBody == content &&
                                                                            por.CannedACL == S3CannedACL.NoACL)));
    }

    Note that it is checking the PutObjectRequest to ensure that the ACL used is "NoACL". This is probably not an optimal default (most people will want the "PublicRead" ACL I imagine) but I decided to err on the side of caution and force people to opt-in to making their content publicly visible. The implementation for the render method looks something like this:

    public void Render(string content, string outputPath)
    {
        if(string.IsNullOrEmpty(outputPath) || string.IsNullOrEmpty(content)) throw new InvalidOperationException("Can't render to S3 with missing key/content.");

        var key = keyBuilder.GetKeyFor(outputPath);
        // Only upload when the object is not already in the bucket.
        if(!FileExists(key))
        {
            UploadContent(key, content);
        }
    }

    void UploadContent(string key, string content)
    {
        var request = new PutObjectRequest()
            .WithBucketName(bucket)
            .WithKey(key)
            .WithCannedACL(cannedACL)
            .WithContentBody(content);

        s3client.PutObject(request);
    }

    bool FileExists(string key)
    {
        try
        {
            var request = new GetObjectMetadataRequest()
                .WithBucketName(bucket)
                .WithKey(key);

            // The response itself is not needed; a successful call means the object exists.
            s3client.GetObjectMetadata(request);

            return true;
        }
        catch(AmazonS3Exception ex)
        {
            // "Not found" is the expected signal for a missing object; anything else is a real error.
            if(ex.StatusCode == HttpStatusCode.NotFound)
            {
                return false;
            }
            throw;
        }
    }

    It has gotten slightly more complex since then (I've added configurable headers and an option for forcing overwrite of the existing file) but the core logic remains the same. It's a fairly naive implementation but my experience with the Amazon services has been good enough so far that I haven't encountered a lot of the exceptions that I hope to add handling for in the future.
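
    The forced-overwrite option mentioned above is not shown in this post, so here is one way it could look. This is a sketch under stated assumptions, not the library's code: IObjectStore, InMemoryObjectStore and OverwriteAwareRenderer are hypothetical names, and the store interface stands in for the two S3 calls so the example runs without the AWS SDK.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction over the two operations the renderer needs from S3.
public interface IObjectStore
{
    bool Exists(string key);
    void Put(string key, string content);
}

// In-memory stand-in for a bucket, used only to make the sketch runnable.
public sealed class InMemoryObjectStore : IObjectStore
{
    readonly Dictionary<string, string> objects = new Dictionary<string, string>();

    public int PutCount { get; private set; }

    public bool Exists(string key) { return objects.ContainsKey(key); }

    public void Put(string key, string content)
    {
        objects[key] = content;
        PutCount++;
    }

    public string Get(string key) { return objects[key]; }
}

// Hypothetical renderer: uploads when the object is missing, or always when overwrite is forced.
public sealed class OverwriteAwareRenderer
{
    readonly IObjectStore store;
    readonly bool forceOverwrite;

    public OverwriteAwareRenderer(IObjectStore store, bool forceOverwrite)
    {
        this.store = store;
        this.forceOverwrite = forceOverwrite;
    }

    public void Render(string content, string key)
    {
        if (string.IsNullOrEmpty(key) || string.IsNullOrEmpty(content))
            throw new InvalidOperationException("Can't render with missing key/content.");

        // Short-circuit: when overwrite is forced, the existence check is skipped entirely.
        if (forceOverwrite || !store.Exists(key))
        {
            store.Put(key, content);
        }
    }
}
```

    Putting the flag first in the condition also saves the metadata round trip whenever an overwrite is requested.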

    Adding Invalidation

    At this point we should be able to render our content directly to S3, but this doesn't get us all the way to where we need to be. While hosting static content in S3 offers some advantages over hosting it locally, and it can work as a CDN, using CloudFront to deliver your S3 content makes more sense if you really want to minimize latency. Because CloudFront caches objects at its edge locations, overwriting a file in S3 is not enough on its own; the cached copy has to be invalidated as well. To make this work we'll just need to add an invalidator to the mix. A test for the core usage looks like this:

    [Test]
    public void Invalidate()
    {
        var cloudfrontClient = new Mock<AmazonCloudFront>();

        var distributionId = Guid.NewGuid().ToString();
        var bucket = Guid.NewGuid().ToString();
        var distribution = bucket + ".s3.amazonaws.com";
        var key = Guid.NewGuid().ToString();

        var listDistributionsResponse = new ListDistributionsResponse();
        listDistributionsResponse.Distribution.Add(new CloudFrontDistribution
        {
            Id = distributionId,
            DistributionConfig = new CloudFrontDistributionConfig
            {
                S3Origin = new S3Origin(distribution, null)
            }
        });

        cloudfrontClient.Setup(cfc => cfc.ListDistributions())
            .Returns(listDistributionsResponse);

        var invalidator = new CloudFrontInvalidator(cloudfrontClient.Object);
        invalidator.InvalidateObject(bucket, key);

        cloudfrontClient.Verify(cfc => cfc.PostInvalidation(It.Is<PostInvalidationRequest>(pir => pir.DistributionId == distributionId
            && pir.InvalidationBatch.Paths.Count == 1
            && pir.InvalidationBatch.Paths.First() == key)));
    }

    The implementation is pretty straightforward, and will look very familiar to anyone who read my post regarding copying buckets with the S3 API. There are only two changes: we only need to invalidate one object at a time, and we only want to query for the list of CloudFront distributions once. Code for the CloudFront invalidator looks like this:

    class CloudFrontInvalidator : IDisposable, IInvalidator
    {
        const string amazonBucketUriSuffix = ".s3.amazonaws.com";
        // Used as the caller reference for the batch; "HH" (24-hour clock) keeps
        // morning and afternoon timestamps from colliding.
        const string dateFormatWithMilliseconds = "yyyy-MM-dd HH:mm:ss.ff";
        readonly AmazonCloudFront cloudFrontClient;

        public CloudFrontInvalidator(AmazonCloudFront cloudFrontClient)
        {
            this.cloudFrontClient = cloudFrontClient;
        }

        public void InvalidateObject(string bucket, string key)
        {
            var distId = GetDistributionIdFor(bucket);
            // Buckets without a matching distribution are silently skipped.
            if(!string.IsNullOrWhiteSpace(distId))
            {
                var invalidationRequest = new PostInvalidationRequest()
                    .WithDistribtionId(distId)
                    .WithInvalidationBatch(new InvalidationBatch(DateTime.Now.ToString(dateFormatWithMilliseconds), new List<string> { key }));

                cloudFrontClient.PostInvalidation(invalidationRequest);
            }
        }

        Dictionary<string, string> distributionNameAndIds;

        string GetDistributionIdFor(string bucketName)
        {
            // Query the distribution list once and cache it as bucket name -> distribution id.
            distributionNameAndIds = distributionNameAndIds ??
                cloudFrontClient.ListDistributions()
                .Distribution
                .ToDictionary(cfd =>
                    cfd.DistributionConfig.S3Origin.DNSName.Replace(amazonBucketUriSuffix, ""),
                    cfd => cfd.Id);

            string id = null;
            distributionNameAndIds.TryGetValue(bucketName, out id);
            return id;
        }

        public void Dispose()
        {
            cloudFrontClient.Dispose();
        }
    }

    As an interesting aside, while I was working on this Amazon released support for querystring invalidation/versioning, which is SquishIt's default behavior. I had planned to add a release note telling people that they would need to use SquishIt's "hash in filename" option, but it seems like now there won't be any need :)
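
    For anyone unfamiliar with the two versioning styles, the difference comes down to where the content hash ends up in the URL. The sketch below is illustrative only: VersioningSketch and its methods are hypothetical, the `r` querystring parameter name is an assumption, and MD5 is used simply as a convenient content hash rather than as a claim about what SquishIt computes.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper showing the two places a content hash can be attached to an asset URL.
public static class VersioningSketch
{
    // Hex-encoded MD5 of the combined content; any stable content hash would serve the same purpose.
    public static string ContentHash(string content)
    {
        using (var md5 = MD5.Create())
        {
            var bytes = md5.ComputeHash(Encoding.UTF8.GetBytes(content));
            return BitConverter.ToString(bytes).Replace("-", "");
        }
    }

    // Querystring versioning: the file name stays fixed and the hash rides along as a parameter.
    // The parameter name "r" is an assumption made for this sketch.
    public static string QueryStringVersion(string path, string hash)
    {
        return path + "?r=" + hash;
    }

    // Hash-in-filename versioning: a '#' placeholder in the output name is replaced by the hash.
    public static string HashInFilename(string pathTemplate, string hash)
    {
        return pathTemplate.Replace("#", hash);
    }
}
```

    With the querystring style the S3 key never changes, which is why it only became practical once CloudFront started treating the querystring as part of the cached object's identity. With the filename style every content change produces a new key, so nothing ever needs to be invalidated.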

    Neat, But How Do I Use This?

    It's nice that this all works on paper (and in unit tests) but how do we actually tie everything together? One of the key design decisions was that the renderer is instantiated with pre-initialized CloudFront and S3 clients. This way users aren't locked into a certain method of getting credentials or anything like that. To use the custom renderer for only a particular bundle, usage would be something like this:

    var s3client = new AmazonS3Client("accessKey", "secretKey");
    var renderer = S3Renderer.Create(s3client)
        .WithBucketName("bucket")
        .WithDefaultKeyBuilder(HttpContext.Current.Request.PhysicalApplicationPath,
                                        HttpContext.Current.Request.ApplicationPath)
        .WithCannedAcl(S3CannedACL.PublicRead) as IRenderer;

    Bundle.JavaScript()
        .WithReleaseFileRenderer(renderer)
        .WithOutputBaseHref("http://s3.amazonaws.com/bucket")
        .Add("file1.js")
        .Add("file2.js")
        .Render("combined.js");

    This is nice, but I think the global configuration is probably what people will be using more often. As is common in ASP.NET apps, a lot of the setup magic for this happens in the app initialization. So you'd add something like this to your Application_Start method (in Global.asax.cs):

    var s3client = new AmazonS3Client("accessKey", "secretKey");
    var renderer = S3Renderer.Create(s3client)
        .WithBucketName("bucket")
        .WithDefaultKeyBuilder(HttpContext.Current.Request.PhysicalApplicationPath,
                                        HttpContext.Current.Request.ApplicationPath)
        .WithCannedAcl(S3CannedACL.PublicRead) as IRenderer;

    Bundle.ConfigureDefaults()
        .UseReleaseRenderer(renderer)
        .UseDefaultOutputBaseHref("http://s3.amazonaws.com/bucket");

    I tried to make this something that could be run via WebActivator, but had trouble finding a method to use that would have access to the HttpContext (needed to resolve application path and virtual directory) so for now it needs to be set up manually. This may be for the best though, as it doesn't force any particular convention for access key / secret key retrieval. It doesn't feel like a ton of setup code to me, hopefully others will agree.

    What's Next

    Now that SquishIt 0.8.7 has been released I can finally start planning to make this available on NuGet as a standard package (currently in beta until I get a little more testing). It can be installed like any other, but will require updating SquishIt if you're using a pre-0.8.7 version. If you need to report any issues encountered while using the library, or feel like contributing some code, please do so on GitHub. Oh, and if you just want to kick the tires on SquishIt without all this other nonsense, or try making it work with another CDN, you can find the core library on NuGet as well.

    About the Author

    Alex is a .NET and SQL Server developer from southeastern PA, where he lives with a lovely wife and a veritable smorgasbord of pets. He recently completed a master's degree in Software Engineering from Penn State. He loves mountain biking, open source software, home brewing, Syracuse basketball, and the Mono runtime.

    4 comments

    Comment from: Jason [Visitor] · http://irwinj.blogspot.com
    Great post and great work - this is exactly what I was looking for. Thanks for the hard work.

    I have three questions if you wouldn't mind clarifying a few things:

    1. Do you have any suggestions for URLs embedded in CSS files? For instance, if I have a CSS file containing a background-url attribute that points to an 'images' folder on my web server, and that file is bundled and the resulting file moved to the CDN, the link to that background image is broken. I don't know if SquishIt supports that kind of scenario. The only thing I can think of is fully qualifying the URL (pointing it to my CDN), but that is something that I'd like to avoid as I would like to keep the files local and unbundled during development.
    2. Do you have any idea when this will exit the pre-release phase? I'm planning on starting to use it soon and was wondering if it is still a work in progress.
    3. Can the UseDefaultOutputBaseHref be used without a specific protocol (i.e. using // instead of https://), or is there a reason why the protocol is required here? I'm just wondering.

    Again, thanks for the great work!
    Jason
    09/04/12 @ 22:46
    Comment from: Alex Ullrich [Member]
    1. I would suggest using relative paths, and putting your images onto the CDN also, mimicking the folder structure you keep locally for development. Relative paths are resolved relative to the CSS file, so if all your content is hosted on the CDN you shouldn't have any problem.

    2. Honestly the code is pretty simple - the only thing I want to confirm is that I can get querystring hashes down in the renderer now that CloudFront supports querystring invalidation. If you use the "HashInFilename" option (by rendering to filename_#.css etc...) it should definitely be fine. I really just wanted some other people to test it before taking it out of pre-release, so if you don't mind being the guinea pig go for it. I will do my best to resolve any issues you encounter quickly (this is just coming back on my radar as SquishIt 0.9 settles in). I feel pretty confident we can have an official release ready in a week or two if you help with the testing.

    3. I've never used a protocol-relative URL for the default output href but don't see why it wouldn't work. If it doesn't work for you, post an issue to the SquishIt core on GitHub, because it definitely should :)
    09/05/12 @ 06:28
    Comment from: Daz Bradbury [Visitor] · http://www.openrent.co.uk
    Absolutely awesome.

    I was hosting on AppHarbor, and had a world of pain with RequestReduce, so switched to SquishIt as the architecture is a lot more straightforward.

    Initially, there were a few problems, as with multiple workers, I had the choice of putting things in the cache, or creating each bundle on application startup such that each worker would know how to process it.

    Then I saw this article, and I have to say, it was a breeze to implement - I forked the repo thinking I would need to make some changes, but that worry was ill-placed.

    I made a few modifications to your initial config which were needed, namely:

    a) When creating an s3client, if you're using a different region to the default US, you'll need to specify it as such (in this case, EUWest):

    var s3client = new AmazonS3Client("accessKey", "secretKey", Amazon.RegionEndpoint.EUWest1);

    Otherwise, Amazon will throw:

    “Maximum number of retry attempts reached”

    b) Part of the point in doing this, for me at least, was so static code is easily cacheable in production. As such, most users will probably want to add the following line to the renderer creation:

    .WithHeaders(Amazon.S3.Util.AmazonS3Util.CreateHeaderEntry("Cache-Control", "public, max-age=3153600"))

    Otherwise, this worked perfectly, injected my cloudfront url, and was a breeze!

    Thanks for sharing!

    11/22/12 @ 07:09
    Comment from: Alex Ullrich [Member]
    Glad to hear it's working well for you! I didn't even think about the region thing, but the problem you ran into makes me happy that I didn't make any assumptions about how people are using the Amazon services.

    Thanks for sharing your experience - this post has basically become the readme for the library so it will probably save someone from running into the same issue you encountered :)
    11/23/12 @ 18:51
