Return-Path: <bug-hurd-bounces+4winter=informatik.uni-hamburg.de@gnu.org>
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on jade.jade-hamburg.de
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,
	TVD_RCVD_SPACE_BRACKET,T_RP_MATCHES_RCVD,UNPARSEABLE_RELAY autolearn=ham
	version=3.3.1
Received: from mailhost.informatik.uni-hamburg.de [134.100.9.70]
	by jade.jade-hamburg.de with IMAP (fetchmail-6.3.9-rc2)
	for <teythoon@localhost> (single-drop); Sun, 23 Nov 2014 17:39:39 +0100 (CET)
Received: from mailhost.informatik.uni-hamburg.de ([unix socket])
	 by mailhost (Cyrus v2.3.18) with LMTPA;
	 Sun, 23 Nov 2014 17:39:08 +0100
X-Sieve: CMU Sieve 2.3
Received: by mailhost.informatik.uni-hamburg.de (Postfix)
	id EE4BC7D7; Sun, 23 Nov 2014 17:39:08 +0100 (CET)
Delivered-To: 4winter@informatik.uni-hamburg.de
Received: from localhost (localhost [127.0.0.1])
	by mailhost.informatik.uni-hamburg.de (Postfix) with ESMTP id E79367D6
	for <4winter@informatik.uni-hamburg.de>; Sun, 23 Nov 2014 17:39:08 +0100 (CET)
X-Virus-Scanned: amavisd-new at informatik.uni-hamburg.de
Received: from mailhost.informatik.uni-hamburg.de ([127.0.0.1])
	by localhost (mailhost.informatik.uni-hamburg.de [127.0.0.1]) (amavisd-new, port 10024)
	with LMTP id qORqUwvkEX57 for <4winter@informatik.uni-hamburg.de>;
	Sun, 23 Nov 2014 17:39:02 +0100 (CET)
Received: from lists.gnu.org (lists.gnu.org [208.118.235.17])
	(using TLSv1 with cipher AES256-SHA (256/256 bits))
	(Client did not present a certificate)
	by mailhost.informatik.uni-hamburg.de (Postfix) with ESMTPS id 167557D3
	for <4winter@informatik.uni-hamburg.de>; Sun, 23 Nov 2014 17:39:02 +0100 (CET)
Received: from eggs.gnu.org ([2001:4830:134:3::10]:35716)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from
	<SRS0=7W39=AN=ens-lyon.org=samuel.thibault@bounce.ens-lyon.org>)
	id 1XsaBP-0005iZ-WF
	for bug-hurd@gnu.org; Sun, 23 Nov 2014 11:38:51 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from
	<SRS0=7W39=AN=ens-lyon.org=samuel.thibault@bounce.ens-lyon.org>)
	id 1XsaBI-0006rG-79
	for bug-hurd@gnu.org; Sun, 23 Nov 2014 11:38:43 -0500
Received: from sonata.ens-lyon.org ([140.77.166.138]:58181)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from
	<SRS0=7W39=AN=ens-lyon.org=samuel.thibault@bounce.ens-lyon.org>)
	id 1XsaBH-0006qX-S2
	for bug-hurd@gnu.org; Sun, 23 Nov 2014 11:38:36 -0500
Received: from sonata.ens-lyon.org ([127.0.0.1])
	by localhost (sonata.ens-lyon.org [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id g66v24OvjEFt for <bug-hurd@gnu.org>;
	Sun, 23 Nov 2014 17:38:34 +0100 (CET)
Received: from type.youpi.perso.aquilenet.fr (youpi.perso.aquilenet.fr
	[80.67.176.89])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by sonata.ens-lyon.org (Postfix) with ESMTPSA id A72C2200A8
	for <bug-hurd@gnu.org>; Sun, 23 Nov 2014 17:38:33 +0100 (CET)
Received: from samy by type.youpi.perso.aquilenet.fr with local (Exim 4.84)
	(envelope-from <samuel.thibault@ens-lyon.org>) id 1XsaBE-0001BL-Ak
	for bug-hurd@gnu.org; Sun, 23 Nov 2014 17:38:32 +0100
Date: Sun, 23 Nov 2014 17:38:32 +0100
From: Samuel Thibault <samuel.thibault@gnu.org>
To: bug-hurd@gnu.org
Subject: Atomic file locking update
Message-ID: <20141123163832.GM4576@type.youpi.perso.aquilenet.fr>
Mail-Followup-To: bug-hurd@gnu.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.21+34 (58baf7c9f32f) (2010-12-30)
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x
X-Received-From: 140.77.166.138
X-BeenThere: bug-hurd@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Bug reports for the GNU Hurd <bug-hurd.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/bug-hurd>,
	<mailto:bug-hurd-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/bug-hurd>
List-Post: <mailto:bug-hurd@gnu.org>
List-Help: <mailto:bug-hurd-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/bug-hurd>,
	<mailto:bug-hurd-request@gnu.org?subject=subscribe>
Errors-To: bug-hurd-bounces+4winter=informatik.uni-hamburg.de@gnu.org
Sender: bug-hurd-bounces+4winter=informatik.uni-hamburg.de@gnu.org

Hello,

While Svante is working on the record locking, I have worked on at
least fixing whole file locking: there was one bug in our current
implementation: flock(LOCK_SH); flock(LOCK_EX);, as per POSIX, does not
guarantee an atomic upgrade from LOCK_SH to LOCK_EX. But
fcntl(SETLK,F_RDLCK); fcntl(SETLK,F_WRLCK); is supposed to guarantee an
atomic upgrade from F_RDLCK to F_WRLCK.

I have thus added a __LOCK_ATOMIC flag, to be used along LOCK_SH
and LOCK_EX, to guarantee atomic upgrades and downgrades, and then
implemented the support in libfshelp. To avoid starvation, my patch
makes the thread which wants to upgrade to LOCK_EX set the lock type
to LOCK_SH|LOCK_EX, and others manage that, to prevent newer LOCK_SH
lockers.

I have successfully tested it with tdb's tdbtorture up to 5 processes
for now, and it works fine, but perhaps somebody could proofread the
changes again, to make sure we aren't forgetting any scenario?

Samuel

diff --git a/libfshelp/lock-acquire.c b/libfshelp/lock-acquire.c
index 574bc5c..dceee5c 100644
--- a/libfshelp/lock-acquire.c
+++ b/libfshelp/lock-acquire.c
@@ -23,17 +23,30 @@ the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.  */
 
 #define EWOULDBLOCK EAGAIN /* XXX */
 
+/* Remove this when glibc has it.  */
+#ifndef __LOCK_ATOMIC
+#define __LOCK_ATOMIC 16
+#endif
+
 error_t
 fshelp_acquire_lock (struct lock_box *box, int *user, pthread_mutex_t *mut,
 		     int flags)
 {
+  int atomic = 0;
+
   if (!(flags & (LOCK_UN | LOCK_EX | LOCK_SH)))
     return 0;
   
   if ((flags & LOCK_UN)
       && (flags & (LOCK_SH | LOCK_EX)))
     return EINVAL;
-  
+
+  if (flags & __LOCK_ATOMIC)
+    {
+      atomic = 1;
+      flags &= ~__LOCK_ATOMIC;
+    }
+
   if (flags & LOCK_EX)
     flags &= ~LOCK_SH;
   
@@ -44,8 +57,10 @@ fshelp_acquire_lock (struct lock_box *box, int *user, pthread_mutex_t *mut,
       if (*user & LOCK_UN)
 	return 0;
 
-      assert (*user == box->type);
-      assert (*user == LOCK_SH || *user == LOCK_EX);
+      assert (*user == box->type ||
+	      (*user == LOCK_SH && box->type == (LOCK_SH | LOCK_EX)));
+      assert (*user == LOCK_SH || *user == LOCK_EX ||
+	      *user == (LOCK_SH | LOCK_EX));
 
       if (*user == LOCK_SH)
 	{
@@ -60,12 +75,39 @@ fshelp_acquire_lock (struct lock_box *box, int *user, pthread_mutex_t *mut,
 	  box->waiting = 0;
 	  pthread_cond_broadcast (&box->wait);
 	}
+      if (box->type == (LOCK_SH | LOCK_EX) && box->shcount == 1 && box->waiting)
+	{
+	  box->waiting = 0;
+	  pthread_cond_broadcast (&box->wait);
+	}
       *user = LOCK_UN;
     }
   else
     {
+      if (atomic && *user == (flags & (LOCK_SH | LOCK_EX)))
+	/* Already have it the right way. */
+	return 0;
+
+      if (atomic && *user == LOCK_EX && flags & LOCK_SH)
+	{
+	  /* Easy atomic change. */
+	  *user = LOCK_SH;
+	  box->type = LOCK_SH;
+	  box->shcount = 1;
+	  if (box->waiting)
+	    {
+	      box->waiting = 0;
+	      pthread_cond_broadcast (&box->wait);
+	    }
+	  return 0;
+	}
+
+      /* We can not have two people upgrading their lock, this is a deadlock! */
+      if (*user == LOCK_SH && atomic && box->type == (LOCK_SH | LOCK_EX))
+	return EDEADLK;
+
       /* If we have an exclusive lock, release it. */
-      if (*user == LOCK_EX)
+      if (*user == LOCK_EX && !atomic)
 	{
 	  *user = LOCK_UN;
 	  box->type = LOCK_UN;
@@ -75,19 +117,9 @@ fshelp_acquire_lock (struct lock_box *box, int *user, pthread_mutex_t *mut,
 	      pthread_cond_broadcast (&box->wait);
 	    }
 	}
-      
-      /* If there is an exclusive lock, wait for it to end. */
-      while (box->type == LOCK_EX)
-	{
-	  if (flags & LOCK_NB)
-	    return EWOULDBLOCK;
-	  box->waiting = 1;
-	  if (pthread_hurd_cond_wait_np (&box->wait, mut))
-	    return EINTR;
-	}
 
       /* If we have a shared lock, release it. */
-      if (*user == LOCK_SH)
+      if (*user == LOCK_SH && !atomic)
 	{
 	  *user = LOCK_UN;
 	  if (!--box->shcount)
@@ -99,12 +131,29 @@ fshelp_acquire_lock (struct lock_box *box, int *user, pthread_mutex_t *mut,
 		  pthread_cond_broadcast (&box->wait);
 		}
 	    }
+	  if (box->type == (LOCK_SH | LOCK_EX) && box->shcount == 1 &&
+	      box->waiting)
+	    {
+	      box->waiting = 0;
+	      pthread_cond_broadcast (&box->wait);
+	    }
 	}
-      
+
+      /* If there is another exclusive lock or a pending upgrade, wait for it to
+         end. */
+      while (box->type & LOCK_EX)
+	{
+	  if (flags & LOCK_NB)
+	    return EWOULDBLOCK;
+	  box->waiting = 1;
+	  if (pthread_hurd_cond_wait_np (&box->wait, mut))
+	    return EINTR;
+	}
+
       assert ((flags & LOCK_SH) || (flags & LOCK_EX));
       if (flags & LOCK_SH)
 	{
-	  assert (box->type != LOCK_EX);
+	  assert (!(box->type & LOCK_EX));
 	  *user = LOCK_SH;
 	  box->type = LOCK_SH;
 	  box->shcount++;
@@ -112,17 +161,27 @@ fshelp_acquire_lock (struct lock_box *box, int *user, pthread_mutex_t *mut,
       else if (flags & LOCK_EX)
 	{
 	  /* Wait for any shared (and exclusive) locks to finish. */
-	  while (box->type != LOCK_UN)
+	  while ((*user == LOCK_SH && box->shcount > 1) ||
+		 (*user == LOCK_UN && box->type != LOCK_UN))
 	    {
 	      if (flags & LOCK_NB)
 		return EWOULDBLOCK;
 	      else
 		{
+		  /* Tell others that we are upgrading.  */
+		  if (*user == LOCK_SH && atomic)
+		    box->type = LOCK_SH | LOCK_EX;
+
 		  box->waiting = 1;
 		  if (pthread_hurd_cond_wait_np (&box->wait, mut))
 		    return EINTR;
 		}
 	    }
+	  if (*user == LOCK_SH)
+	    {
+	      assert (box->shcount == 1);
+	      box->shcount = 0;
+	    }
 	  box->type = LOCK_EX;
 	  *user = LOCK_EX;
 	}