Backport most significant upstream fixes (excl. hwsim fixes)

Refreshed all patches.

Contains important fixes for CSA (Channel Switch Announcement)
and A-MSDU frames.

[slightly altered to apply cleanly]
Signed-off-by: Koen Vandeputte <koen.vandeputte@ncentric.com>
	
From: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Date: Fri, 31 Aug 2018 11:31:06 +0300
Subject: [PATCH] mac80211: fix a race between restart and CSA flows

We hit a problem with iwlwifi that was caused by a bug in
mac80211. A bug in iwlwifi caused the firmware to crash in
certain cases in channel switch. Because of that bug,
drv_pre_channel_switch would fail and trigger the restart
flow.
Now we had the hw restart worker which runs on the system's
workqueue and the csa_connection_drop_work worker that runs
on mac80211's workqueue that can run together. This is
obviously problematic since the restart work wants to
reconfigure the connection, while the csa_connection_drop_work
worker does the exact opposite: it tries to disconnect.

Fix this by cancelling the csa_connection_drop_work worker
in the restart worker.

Note that this can sound racy: we could have:

driver   iface_work   CSA_work   restart_work
+++++++++++++++++++++++++++++++++++++++++++++
              |
 <--drv_cs ---|
<FW CRASH!>
-CS FAILED-->
              |                       |
              |                 cancel_work(CSA)
           schedule                   |
           CSA work                   |
                         |            |
                        Race between those 2

But this is not possible because we flush the workqueue
in the restart worker before we cancel the CSA worker.
That would be bulletproof if we could guarantee that
we schedule the CSA worker only from the iface_work
which runs on the workqueue (and not on the system's
workqueue), but unfortunately we do have an instance
in which we schedule the CSA work outside the context
of the workqueue (ieee80211_chswitch_done).

Note also that we should probably cancel other workers
like beacon_connection_loss_work and possibly others
for different types of interfaces, at the very least,
IBSS should suffer from the exact same problem, but for
now, do the minimum to fix the actual bug that was actually
experienced and reproduced.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---

--- a/net/mac80211/main.c
+++ b/net/mac80211/main.c
@@ -255,8 +255,27 @@ static void ieee80211_restart_work(struc
 
 	flush_work(&local->radar_detected_work);
 	rtnl_lock();
-	list_for_each_entry(sdata, &local->interfaces, list)
+	list_for_each_entry(sdata, &local->interfaces, list) {
+		/*
+		 * XXX: there may be more work for other vif types and even
+		 * for station mode: a good thing would be to run most of
+		 * the iface type's dependent _stop (ieee80211_mg_stop,
+		 * ieee80211_ibss_stop) etc...
+		 * For now, fix only the specific bug that was seen: race
+		 * between csa_connection_drop_work and us.
+		 */
+		if (sdata->vif.type == NL80211_IFTYPE_STATION) {
+			/*
+			 * This worker is scheduled from the iface worker that
+			 * runs on mac80211's workqueue, so we can't be
+			 * scheduling this worker after the cancel right here.
+			 * The exception is ieee80211_chswitch_done.
+			 * Then we can have a race...
+			 */
+			cancel_work_sync(&sdata->u.mgd.csa_connection_drop_work);
+		}
 		flush_delayed_work(&sdata->dec_tailroom_needed_wk);
+	}
 	ieee80211_scan_cancel(local);
 
 	/* make sure any new ROC will consider local->in_reconfig */