caching – How do I decide an initial in-memory cache size given my DB size and expected load throughput?

(Purely for learning purposes)

Say the DB contains 1 billion rows with 200 bytes per row = 200 GB of data.

The traffic at peak is 1000 requests/s, with each request asking for one DB row.

What cache size would I begin with to ease the load on the DB? I realize this is best determined empirically and can be tuned over time.

Caches are usually not too large given memory constraints (unless you go for a distributed cache like Redis), so suppose the in-memory cache can’t be more than, say, 200 MB. At 200 bytes per row that’s about one million entries, or 0.1% of the rows, which is way less than 1% of the DB size and seems too small. The cache might just spend all its time 100% occupied with 95% misses, evicting entries and caching new ones under a simple LRU scheme.

Perhaps there’s no point bothering to cache anything in-memory here. In that case, how would you go about coming up with an initial cache size for a Redis cache?
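To sanity-check the 95%-miss worry, here is a rough estimate. It leans on a strong assumption, namely that key popularity follows a Zipf distribution with exponent 1 (real traffic may be far less skewed), and on a closed-form approximation of the harmonic numbers:

```python
import math

EULER_MASCHERONI = 0.5772156649

def harmonic(n: int, s: float) -> float:
    """Approximate generalized harmonic number H(n, s); exact summation
    would be too slow for n = 1e9."""
    if s == 1.0:
        return math.log(n) + EULER_MASCHERONI
    return 1 + (n ** (1 - s) - 1) / (1 - s)

def zipf_hit_rate(cache_entries: int, total_rows: int, s: float = 1.0) -> float:
    """Hit rate if the cache holds the `cache_entries` most popular of
    `total_rows` keys and request popularity is Zipf with exponent s."""
    return harmonic(cache_entries, s) / harmonic(total_rows, s)

# 200 MB cache at ~200 B per entry -> ~1 million entries; 1 billion rows total.
rate = zipf_hit_rate(1_000_000, 1_000_000_000)
print(f"approx hit rate: {rate:.0%}")
```

Under that assumption, a cache holding just 0.1% of the rows already absorbs roughly two-thirds of the requests; the 95%-miss scenario only materializes when access is close to uniform.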

Varnish isn’t caching any pages from Drupal

I have a Docksal stack that includes Varnish (with the acquia Docksal stack).

I made a static HTML page in the docroot, and Varnish is caching it, as shown by the HTTP headers:

$ curl -IL http://varnish.cx.docksal/cached.html
HTTP/1.1 200 OK
Server: openresty/1.15.8.2
Date: Mon, 22 Jun 2020 09:01:45 GMT
Content-Type: text/html
Content-Length: 0
Connection: keep-alive
X-Content-Type-Options: nosniff
Last-Modified: Thu, 18 Jun 2020 10:39:03 GMT
ETag: "0-5a85962945fc0"
X-Varnish: 5 3
Age: 2
Via: 1.1 varnish-v4
X-Varnish-Cache: HIT
Cache-Control: private,no-cache
Accept-Ranges: bytes

I’ve set up a new Drupal 8 site, created one node, and the HTTP headers show it’s not getting cached by Varnish.

With https://www.drupal.org/project/adv_varnish enabled, I get this:

$ curl -IL http://varnish.cx.docksal/node/1
HTTP/1.1 200 OK
Server: openresty/1.15.8.2
Date: Mon, 22 Jun 2020 09:02:06 GMT
Content-Type: text/html; charset=UTF-8
Connection: keep-alive
X-Powered-By: PHP/7.3.11
X-Drupal-Dynamic-Cache: MISS
Link: <http://varnish.cx.docksal/node/1>; rel="canonical", <http://varnish.cx.docksal/node/1>; rel="shortlink", <http://varnish.cx.docksal/node/1>; rel="revision"
X-UA-Compatible: IE=edge
Content-language: en
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
Expires: Sun, 19 Nov 1978 05:00:00 GMT
X-Generator: Drupal 8 (https://www.drupal.org)
Vary: X-Bin
X-Grace: 10
X-TTL: 0
X-Tag: 32f5 de19 143b 864e 4f2e e312 1baa 1f8b 88c8 e062 678c 7c65 0b6e 93b3 7d01 76fd 7c7e 09a2 476d 4999 e459 f70b 4016 fd28 67aa 5ff8 a8bc bdb1
X-Adv-Varnish: Cache-enabled
X-Varnish-Secret: 648858c54f78622c9df4c549f87d71cda274420faedc3c5de92a856cb939a914
X-Deflate-Key: b91f02cf5af93c1c9141ba365d5852c9d12df6fd39f8b5d9c3b80dc1bc1428e1
X-Drupal-Cache: HIT
Last-Modified: Thu, 18 Jun 2020 15:38:17 GMT
ETag: "1592494697"
X-Varnish: 32772
Age: 0
Via: 1.1 varnish-v4
X-Varnish-Cache: MISS
Cache-Control: private,no-cache
Accept-Ranges: bytes

Without that module enabled, I get this:

$ curl -IL http://varnish.cx.docksal/node/1
HTTP/1.1 200 OK
Server: openresty/1.15.8.2
Date: Mon, 22 Jun 2020 09:15:40 GMT
Content-Type: text/html; charset=UTF-8
Connection: keep-alive
X-Powered-By: PHP/7.3.11
X-Drupal-Dynamic-Cache: MISS
Link: <http://varnish.cx.docksal/node/1>; rel="canonical", <http://varnish.cx.docksal/node/1>; rel="shortlink", <http://varnish.cx.docksal/node/1>; rel="revision"
X-UA-Compatible: IE=edge
Content-language: en
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
Expires: Sun, 19 Nov 1978 05:00:00 GMT
X-Generator: Drupal 8 (https://www.drupal.org)
X-Drupal-Cache: HIT
X-Varnish: 98338
Age: 0
Via: 1.1 varnish-v4
X-Varnish-Cache: MISS
Cache-Control: private,no-cache
Accept-Ranges: bytes

In both cases, Varnish isn’t caching the page (I’m loading it twice, and I’d expect to have a Varnish cache hit the second time).

What am I doing wrong?
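One observation worth checking: both Drupal responses above end with Cache-Control: private,no-cache, and a stock Varnish 4 running its built-in VCL marks such responses uncacheable (hit-for-pass); the stack’s custom VCL may be caching the static page explicitly while passing Drupal’s responses through. A simplified sketch of those built-in checks (illustration only, not the actual builtin.vcl, and the function name is made up):

```python
import re

def uncacheable_reasons(headers: dict) -> list:
    """Simplified mirror of the checks in Varnish 4's built-in
    vcl_backend_response that turn a response into hit-for-pass.
    Illustration only; the real logic lives in builtin.vcl."""
    reasons = []
    cc = headers.get("Cache-Control", "")
    if re.search(r"(?i)no-cache|no-store|private", cc):
        reasons.append(f"Cache-Control forbids shared caching: {cc!r}")
    if "Set-Cookie" in headers:
        reasons.append("response sets a cookie")
    if headers.get("Vary") == "*":
        reasons.append("Vary: * is never cacheable")
    return reasons

# Headers from the /node/1 responses above:
print(uncacheable_reasons({"Cache-Control": "private,no-cache"}))
```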

Can server-side caching misconfiguration lead to stolen logins?

If a web app sends Cache-Control: private, it shouldn’t be cached, for example by nginx’s proxy_cache. What could happen if it were cached anyway? Could another visitor see the personalized page of a logged-in user? Could another visitor even end up logged in as that user?
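Both outcomes are possible. To make the risk concrete, here is a toy model (all names invented) of a shared cache that wrongly stores a response marked Cache-Control: private:

```python
class NaiveProxyCache:
    """Toy shared cache that (incorrectly) ignores Cache-Control: private.
    Illustrates the leak, not any real proxy."""
    def __init__(self):
        self.store = {}

    def fetch(self, url, user, origin):
        if url in self.store:
            return self.store[url]   # served to ANY later visitor
        resp = origin(url, user)
        self.store[url] = resp       # bug: caches a private response
        return resp

def origin(url, user):
    # Personalized response that a correct shared cache must never store.
    return {"body": f"Welcome back, {user}!",
            "headers": {"Cache-Control": "private",
                        "Set-Cookie": f"session={user}-secret"}}

cache = NaiveProxyCache()
alice = cache.fetch("/account", "alice", origin)
bob = cache.fetch("/account", "bob", origin)
print(bob["body"])                    # prints "Welcome back, alice!"
print(bob["headers"]["Set-Cookie"])   # Bob receives Alice's session cookie
```

If the cached copy includes the Set-Cookie header, the second visitor is handed the first visitor’s session, which is exactly a stolen login; if only the body is cached, they still see the other user’s personalized page.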

reverse proxy – NGINX – Caching not working – Always MISS

I’m currently working on enabling caching on a NGINX (Reverse Proxy).

With this configuration, the value of X-Proxy-Cache is always MISS.

My current conf (in /etc/nginx/conf.d):

proxy_cache_path /data/nginx/cache/MON_URL levels=1:2 keys_zone=MY_ZONE:8m max_size=1000m inactive=24h;

upstream opera {
    server MY_SERVER_IP:443;
}

server {
    listen 80;
    server_name MON_URL www.MON_URL;
    return 301 https://MON_URL$request_uri;

}

server {
    listen 443 ssl;
    server_name www.MON_URL;
    return 301 $scheme://MON_URL$request_uri;
}

server {
    listen 443 ssl;
    server_name MON_URL;

    location / {
        proxy_pass https://opera/;
        rewrite ^/$ /OPERA/ last;
        proxy_cache MY_ZONE;
        proxy_cache_valid any 24h;
        proxy_set_header Host $host;
        add_header X-Proxy-Cache $upstream_cache_status;
    }

    location /robots.txt {
        return 200 "User-agent: *\nDisallow: /";
    }
}

I tried different configurations based on the docs and on various posts here, without success.

I hope someone can help me move forward on this problem. Thanks in advance!

caching – How is data cache implemented in this case?

I have downloaded source code for GDBM. In its header there is the following commented typedef:

/* We want to keep from reading buckets as much as possible.  The following is
   to implement a bucket cache.  When full, buckets will be dropped in a
   least recently read from disk order.  */

/* To speed up fetching and "sequential" access, we need to implement a
   data cache for key/data pairs read from the file.  To find a key, we
   must exactly match the key from the file.  To reduce overhead, the
   data will be read at the same time.  Both key and data will be stored
   in a data cache.  Each bucket cached will have a one element data
   cache.  */

typedef struct
{
  int     hash_val;
  int     data_size;
  int     key_size;
  char    *dptr;
  size_t  dsize;
  int     elem_loc;
} data_cache_elem;

From the given comments I understand that this data structure will somehow speed up
access to the hash table’s elements by caching some of them. I just can’t understand how it is done. Is there some special approach that allows the data to be explicitly cached, as the comments suggest? Or is it done by creating an ordinary static array? So far I can’t get the details from GDBM’s sources themselves, because the project has lots of large source files that are almost impossible for me to understand.
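Nothing special is going on: the “cache” in these comments is just ordinary heap memory that GDBM manages itself, a table of these structs searched by hash value, with least-recently-read eviction. A sketch of the idea (in Python for brevity; GDBM’s real code is a plain C array, and the names here are invented):

```python
from collections import OrderedDict

class BucketCache:
    """Sketch of the kind of cache GDBM's comment describes: a plain
    in-memory table of recently read disk buckets, evicted in
    least-recently-read order. No special hardware or OS facility is
    involved; it is ordinary allocated memory managed by the library."""
    def __init__(self, capacity, read_from_disk):
        self.capacity = capacity
        self.read_from_disk = read_from_disk   # the expensive operation to avoid
        self.buckets = OrderedDict()           # bucket address -> bucket data

    def get(self, addr):
        if addr in self.buckets:
            self.buckets.move_to_end(addr)     # mark as most recently read
            return self.buckets[addr]
        data = self.read_from_disk(addr)
        if len(self.buckets) >= self.capacity:
            self.buckets.popitem(last=False)   # drop least recently read
        self.buckets[addr] = data
        return data

reads = []
cache = BucketCache(2, lambda a: reads.append(a) or f"bucket@{a}")
cache.get(1); cache.get(2); cache.get(1)   # second get(1) hits the cache
cache.get(3)                               # evicts bucket 2, not bucket 1
print(reads)  # [1, 2, 3] -- bucket 1 was read from disk only once
```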

I will create eye-catching book covers for $10

I will create eye-catching book covers

*** I will Do Book Cover Design, Book Cover *** I am a graphic designer, perfectly positioned to help you and your company design a book cover. Through experience gained over time, I can create a cover from scratch or turn your existing book cover into a Kindle cover (or other formats) in the shortest possible time. I specialize in increasing your reputation and credibility through book cover design that will grab your readers’ attention. What you will get from this gig:

  • A correctly formatted book cover

  • A print-ready book cover that will stand out on any shelf

  • Designs matched to your existing book covers

  • Delivery in JPG, PNG, and PDF (300 dpi / CMYK), high resolution

  • Unlimited revisions

You might want to stop reading and hit the order button now, and let’s get you on the bestseller list. And if my design doesn’t suit you, you will be refunded 40%.


THANK YOU


Working around TLS caching in fiber-based job systems

While MSVC provides a flag for “fiber-safe optimizations” (/GT), GCC and Clang do not (and even refuse to acknowledge the problem): the optimizer may cache the address of a thread-local variable across a fiber suspension point, which breaks if the fiber resumes on a different thread. So I ask: are there any workarounds that developers have figured out in the meantime, and what are they?

Select a design pattern for caching of data

I am working on the design of a system that caches the data on the hard disk in a structure like below:

Cache System (repository)

  • Station 1

  • Station 2

    • config.xml
    • map.jpg
    • agreement.manifest

So I need to cache several stations, where each station may have one or more files related to it; for example, station 1 doesn’t have a manifest file but station 2 does.
I think the composite pattern is a good candidate, but I cannot implement it well. In your opinion, which design pattern best fits this sort of problem? It would be very helpful if you could provide sample pseudo-code.

Thanks in advance
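The composite pattern does fit: stations and files share one interface, and a station that lacks a manifest simply has one leaf fewer. A minimal sketch (the names and the size() operation are placeholders for whatever operation the cache actually needs, e.g. save/load/invalidate):

```python
from abc import ABC, abstractmethod

class CacheEntry(ABC):
    """Component of the composite: files and stations share this interface."""
    @abstractmethod
    def size(self) -> int: ...

class CachedFile(CacheEntry):
    """Leaf: a single file such as config.xml or agreement.manifest."""
    def __init__(self, name: str, size_bytes: int):
        self.name, self.size_bytes = name, size_bytes
    def size(self) -> int:
        return self.size_bytes

class Station(CacheEntry):
    """Composite: a station (or the whole repository) holding any number
    of files or sub-groups, treated uniformly through CacheEntry."""
    def __init__(self, name: str):
        self.name, self.children = name, []
    def add(self, entry: CacheEntry) -> None:
        self.children.append(entry)
    def size(self) -> int:
        return sum(c.size() for c in self.children)

repo = Station("repository")
s1, s2 = Station("Station 1"), Station("Station 2")
s2.add(CachedFile("config.xml", 1_024))
s2.add(CachedFile("map.jpg", 500_000))
s2.add(CachedFile("agreement.manifest", 256))  # station 1 simply lacks this leaf
repo.add(s1); repo.add(s2)
print(repo.size())  # 501280: one recursive call covers every station and file
```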

caching – why do my cache entries all double up with a #cache_redirect item?

I’m caching a piece of a render array by adding cache keys to it. It’s getting cached correctly, but I see two rows for every instance of it.

One looks normal — contains the render array, with a cid that looks like:

CACHE_KEY:[languages:language_interface]=en:[languages:language_url]=en:[theme]=MYTHEME:[user.permissions]=HASH

The other one has data that starts like:

a:2:{s:15:"#cache_redirect";b:1;s:6:"#cache";a:5:{s:4:"keys";a:1 ...

and its CID is:

CACHE_KEY:[languages:language_interface]=en:[theme]=MYTHEME:[user.permissions]=HASH

It looks like this has something to do with language negotiation, but why, and is this normal? The page I load when testing this uses the language prefix, so I’m not being redirected.
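For what it’s worth, this doubling appears to be by design rather than a bug: when rendering discovers extra cache contexts that were not known up front (here languages:language_url, bubbled up from something inside the render array), the render cache stores a #cache_redirect entry under the shorter cid naming the fuller context set, plus the real entry under the longer cid. A simplified sketch of that two-step lookup (illustration only, not Drupal’s actual code):

```python
def get_cached(cache, keys_cid, request_contexts):
    """Sketch of a cache-redirect lookup: the entry stored under the basic
    cid may be a redirect naming extra cache contexts that only became
    known while rendering. Drupal's real implementation lives in the
    render cache; this just shows the idea."""
    item = cache.get(keys_cid)
    if item and item.get("#cache_redirect"):
        # Re-run the lookup with the fuller context list from the redirect.
        extra = ":".join(f"[{c}]={request_contexts[c]}"
                         for c in item["contexts"])
        return cache.get(keys_cid + ":" + extra)
    return item

cache = {
    "KEY:[theme]=MYTHEME": {"#cache_redirect": True,
                            "contexts": ["languages:language_url"]},
    "KEY:[theme]=MYTHEME:[languages:language_url]=en": {"body": "rendered"},
}
print(get_cached(cache, "KEY:[theme]=MYTHEME",
                 {"languages:language_url": "en"}))  # {'body': 'rendered'}
```

So two rows per item is normal, and the redirect involves no HTTP redirect at all, which is why you never see one in the browser.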

Caching or in-memory table in Azure for performance

I am building an Angular web application that retrieves part of its data from an Azure SQL Database table via APIs developed in Azure Functions (with Azure API Management as the API gateway). The data in the table (30k records) does not change for at least 24 hours. The web app needs to display this data in a grid (table structure) with pagination; users can apply filter conditions to retrieve and show a subset of the data in the grid (again with pagination), and they can sort the data on a column. The web app will be accessed by a few hundred users on their iPads/tablets over 3G internet speeds. Keeping latency in mind, I am considering one of these two options for optimum performance:

1) Cache all the records from the DB table in Azure Redis Cache, with a cache refresh every 24 hours, so that the application populates the grid from the cache and avoids expensive SQL DB disk I/O. However, I am not sure how filtering on a field value or a range of values would work against the cached data. I have read about using the Hash data type for storing multivalued objects and Sorted Sets for storing sorted data, but I am particularly unsure about filtering on a range of numeric values (similar to a BETWEEN clause in SQL). Also, is it advisable at all to use Redis this way for my use case?
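On the range-filter doubt in option 1: Redis sorted sets support this directly. Store each record id as a member with the numeric column as its score (ZADD), then ZRANGEBYSCORE gives the BETWEEN-style lookup, and the full records can live in a hash keyed by id. A pure-Python stand-in for that sorted-set behavior (no Redis server involved; the names are illustrative):

```python
import bisect

class SortedIndex:
    """Pure-Python stand-in for a Redis sorted set: ZADD / ZRANGEBYSCORE
    offer exactly this score-range lookup. Member = record id, score =
    the numeric field being filtered on."""
    def __init__(self):
        self.scores, self.members = [], []

    def zadd(self, score, member):
        # Keep both lists sorted by score so range queries are two bisects.
        i = bisect.bisect_left(self.scores, score)
        self.scores.insert(i, score)
        self.members.insert(i, member)

    def zrangebyscore(self, lo, hi):
        i = bisect.bisect_left(self.scores, lo)
        j = bisect.bisect_right(self.scores, hi)
        return self.members[i:j]

idx = SortedIndex()
for rec_id, price in [(1, 10), (2, 25), (3, 40), (4, 55)]:
    idx.zadd(price, rec_id)
print(idx.zrangebyscore(20, 50))   # [2, 3]: records with 20 <= price <= 50
```

One sorted set per filterable numeric column is the usual shape of this pattern; multi-column filters are then set intersections on the returned ids.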

2) Use In-Memory OLTP (a memory-optimized table for this particular DB table) in Azure SQL DB for faster data retrieval. This would allow handling the filtering and sorting requests from the web app with plain SQL queries. However, I am not sure it’s appropriate to use memory-optimized tables just to improve read performance (from what I have read, Microsoft suggests them for insert-heavy transactional workloads).

Any comments or suggestions on the above two options or any other alternative way to achieve performance optimization?